NATURE OF EVALUATION


CLIFFORD WOODY
University of Michigan 


Editor's note: Much is written these days on the general subject of evalua- 
tion. The author presents an analysis of this concept with a number of perti- 
nent suggestions. 


PRELIMINARY to the discussion of the nature of evaluation it seems 
well to emphasize three generalizations which are so evident that they can 
be spoken of as axioms fundamental to any program of evaluation. 


THREE BASIC AXIOMS SHOULD BE CONSIDERED 


1. The schools belong to the people. The school is an agency set up 
by the people to educate the children of the people. The people have voted 
to tax themselves in order to pay teachers and to provide the materials to be 
used in the education of their children. Thus it is well for all to keep in 
mind the fact that the school belongs to the people—not to the teachers. 

2. The people will have the kind of a school that they want. The 
people pay the taxes; they elect the boards of education, the supervisors and 
school officials; in the long run the people decide the kind of a school they 
desire. The superintendent, school officials, and teachers may attempt inno- 
vations in the educational process, but unless the people are convinced of 
the wisdom of such innovations both the innovations and the agencies which 
fostered them will be eliminated. Such conditions make it imperative that 
school officials and teachers create understanding among the people con- 
cerning new values in education. Educational statesmanship demands that 
the school shall be a community enterprise built through the cooperative 
participation of the people, teachers, and school officials. While the school 
officials and teachers may exercise effective leadership, the people, in the 
long run, will determine whether or not such leadership is acceptable. Thus 
if the school officials and teachers wish to stress new values in education, 
they must create demands for such values among the people. 

3. Instruction almost invariably stresses the values emphasized in evalu-
ation. If final examinations emphasize only facts, teachers will soon stress
nothing but facts; if final examinations are given only in the three “R’s,” 
main emphasis in teaching will be on those subjects. Since instruction tends 
to follow the lines stressed in evaluation, it is clear that if so-called “new
values” in education are to be emphasized, these values must have a place in
the scheme of evaluation. There must be instruments for the measurement 
of these values just the same as there are instruments for the measurement 
of achievement in reading, arithmetic, and spelling. 


THE PEOPLE PREFER CERTAIN OUTCOMES TO OTHERS 
The opinion of parents. In view of the fact that the people themselves 
participate in the formulation of school policy and that in the long run 
they sit in final judgment on the survival of such policies, it is imperative 
that schoolmen know what the people desire that their children acquire 
while in the public schools. Two recent studies in this field will be cited to
provide an answer to this question. 
At the suggestion of a group of men attending the National Education 
Association meetings, Lester S. Ivins, Dean of Defiance College, sent an 
inquiry to representative parents throughout all parts of the nation asking 
the following question, “What are the most important lessons to be taught
in schools?” The answers¹ of these parents are listed below in their rank
order: 


1. Lessons that will impress the value of good character
2. That may prevent selfishness towards others
3. That will improve or produce good manners
4. That teach the value of honesty and truthfulness
5. That aid in good sportsmanship
6. That will teach respect for the church, other pupils and authority,
   as well as for the proper kind of government officials
7. That will impress them with the value of cooperation with others
8. That will teach worthwhile lessons from textbooks
9. That will teach them facts from magazines and library books that
   contain a lesson of importance
10. That will show why great men succeeded


It is interesting to note that only two of the items mentioned above make
any reference to the materials of the textbook. Eight of the ten items
emphasize the so-called dynamic qualities, i.e., character, cooperation, good
sportsmanship, unselfishness, honesty, and respect for personality and
institutions.

¹ Ivins, Lester S. “What Parents Expect of the School,” Journal of National Edu-
cation Association, XXVIII (October, 1939), 194.

An investigation somewhat similar to that of Dean Ivins was conducted 
by J. W. Menge, Associate in Evaluation, Michigan Study of the Secondary 
School Curriculum. This investigation involved 1096 responses from the 
parents of pupils in ten representative high schools of Michigan. The re- 
sponses obtained should be reasonably typical of the responses from the 
state as a whole since the high schools selected had enrollments varying 
from few to many pupils, and since the schools were widely distributed 
geographically. 

The questionnaire itself differed from those usually designed for
establishing the purposes of education in that, instead of having the parents
check the importance of statements of general purposes, they were asked
to give their opinion as to the importance of having their child perform
certain types of educational tasks or engage in certain types of educational
and social activities. Each question was prefaced by the question, “Is it
important that the school help your son or daughter to learn?”, and was
followed by a list of various tasks and activities. The parents considered
the specific task or activity, and recorded their judgment by checking one of
the following responses: Great importance, Some importance, Little im-
portance, No importance, No opinion. Table I, presenting the answers to
twenty selected items of the questionnaire, will make clear the nature of the
instrument and the method of making response.


TABLE I

DEGREE OF IMPORTANCE ATTACHED TO SCHOOL’S EFFORT TO HELP IN VARIOUS
TYPES OF ACHIEVEMENT AS SHOWN BY THE PERCENTAGES IN RESPONSES
OF 1096 PARENTS FROM 10 COMMUNITIES

                                                              Degree of Importance
Is it important that the school should help your son or
daughter to learn:                                            Great  Some  Little    No  No Opinion

 1. to make intelligent decisions for himself?                   90     7       2     1           0
 2. to answer questions about famous authors of the past?         9     -       -     -           -
 3. to enjoy music?                                              47    45       -     1           1
 4. to plan for himself ways of meeting his own problems
    both in school and out of school?                            92     6       1     0           1
 5. to solve algebra problems?                                   23    44      21     8           4
 6. to select and to participate in satisfactory kinds
    of recreation?                                               60    32       5     1           2
 7. to be a scientist?                                           11    28      29    20          12
 8. to enjoy art?                                                21     -      20     7           5
 9. to read a foreign language?                                  19    39      25    13           4
10. to take part in social affairs with other boys and
    girls?                                                       71    23       4     1           1
11. to understand and to meet the problems related to
    living in the home?                                          82    12       4     1           1
12. to solve geometry problems?                                  20    39      25    11           5
13. to be a skilled musician?                                    18    27      30    16           9
14. to collect and use information about his own
    problems?                                                    76    15       7     1           1
15. to write in a foreign language?                              11    28      31    23           7
16. to cooperate with other boys and girls in working
    on their own problems?                                       75    18       4     -           2
17. to be an artist?                                              9    23      31    25          12
18. to cooperate with other boys and girls and with
    adults in working on problems in the community?              73    22       3     -           1
19. to understand and make use of important principles
    of science that he (or she) can apply in everyday
    life?                                                        67    26       4     1           2
20. to judge for himself whether his work in school is
    satisfactory or unsatisfactory?                              81    14       2     -           2

A dash marks a figure that is illegible in the source.

Consideration of Item No. 1 of this table shows that 90 per cent of
1096 parents in these ten communities thought it was of Great importance
for the school to help their sons and daughters to make intelligent decisions;
another 7 per cent thought this activity was of Some importance; 2 per
cent felt that it was of Little importance; and 1 per cent indicated that it
was of No importance. These responses indicate that parents almost uni-
versally desired the school to help their sons and daughters make intelligent
decisions. Such an activity seemed more important to them than that of
Item No. 2, “to answer questions about famous authors of the past.”

The reader should give careful consideration to the responses to each
individual item, but the writer must confine his discussion to a few gen-
eralizations. These tabulations show that:


1. Ninety per cent or more of these parents indicate that they thought
   it was of Great importance that the school should lead their sons
   and daughters in making intelligent decisions and planning ways
   of meeting problems both in and out of school.

2. Over seventy per cent of the parents feel that the following items
   are of Great importance: to take part in social affairs with other
   boys and girls; to understand and to meet the problems related to
   living in the home; to collect and use information about his own
   problems; to cooperate with other boys and girls in working on
   their own problems; to cooperate with other boys and girls and
   with adults in working on problems in the community; and to
   judge for himself whether his work in school is satisfactory.

3. Only a small percentage of the parents check as of Great importance
   a number of items involving emphasis on the mastery of subject
   matter contained in the textbook. Witness the following: nine per
   cent of the parents check as of Great importance the activity of an-
   swering questions about famous authors of the past; eleven per
   cent, the ability to write in a foreign language; eleven per cent, to
   be a scientist.
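
The grouping used in these three generalizations can be reproduced mechanically. The Python sketch below is offered only as an illustration: the percentages are the Great importance figures for the items named in the generalizations, as read from Table I, while the shortened item labels, the function names, and the cutoff of 15 per cent for "a small percentage" are assumptions introduced here and form no part of the original study.

```python
# Per cent of parents checking "Great importance" for the items named in
# the three generalizations (figures read from Table I; labels shortened).
GREAT_PCT = {
    "make intelligent decisions for himself": 90,
    "plan ways of meeting his own problems": 92,
    "take part in social affairs": 71,
    "meet problems of living in the home": 82,  # Table I entry, read here as 82
    "collect and use information about his own problems": 76,
    "cooperate with other boys and girls": 75,
    "cooperate on community problems": 73,
    "judge whether his school work is satisfactory": 81,
    "answer questions about famous authors": 9,
    "write in a foreign language": 11,
    "be a scientist": 11,
}

LOW_CUTOFF = 15  # hypothetical cutoff for "a small percentage"


def band(pct):
    """Assign a percentage to one of the bands used in the generalizations."""
    if pct >= 90:
        return "ninety or more"
    if pct > 70:
        return "over seventy"
    if pct <= LOW_CUTOFF:
        return "small"
    return "intermediate"


def group_by_band(percentages):
    """Map each band to its items, highest percentage first."""
    groups = {}
    for item, pct in sorted(percentages.items(), key=lambda kv: -kv[1]):
        groups.setdefault(band(pct), []).append(item)
    return groups


groups = group_by_band(GREAT_PCT)
for name, items in groups.items():
    print(name, len(items))
```

Under these assumptions the items fall into the same three groups the writer describes; a different cutoff for the lowest band would, of course, move the boundary.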


The above generalizations have been based upon the percentage of
parents who checked Great importance. Although consideration should be
given to other responses, it seems safe to say that these parents place an
emphasis on that type of instruction which will develop in their sons and
daughters the so-called dynamic qualities rather than that which will result
in the mastery of factual knowledge.


The opinion of professional educators. Professor Kilpatrick, in a recent
article² published in Progressive Education entitled “The Education to Be
Sought,” lists as the five principal aims that must guide in the educative
process the following:


1. A well-adjusted personality
2. Rich all-round living, with the necessary techniques to make a success
   of what is to be attempted
3. Due regard for others, their rights and feelings
4. Acting on thinking; ever better meanings ever better put to use
5. Such living as creatively sprouts other and finer living


In 1938 the Educational Policies Commission of the National Educa-
tion Association published a little book³ entitled The Purposes of Education
in American Democracy in which the following statements of objectives
are listed:


1. The objectives of self-realization
2. The objectives of human relationship
3. The objectives of economic efficiency
4. The objectives of civic responsibility


These objectives have been accepted by the State Department of Public
Instruction as the values which should be sought in its present program of
instructional improvement. These aims, if elaborated, are approximately the
same as those set forth by the parents and by Kilpatrick. The language of
the laymen and that of the professional educators is slightly different, but
the meaning of their language is approximately the same. This close accord,
manifested in a study of the lay and professional opinion as to the aims
of the public school, makes even more significant the implications for
evaluation. If the schools belong to the people, if the people can have the
kind of a school they want, and if instruction follows the values emphasized
by the parents of the school children and the professional educators, then
these values must be stressed in an adequate program of evaluation. It also
follows that the program of evaluation must be as broad as the program
of education, and must stress the same values as the instruction program.


² Kilpatrick, William H. “The Education to Be Sought,” Progressive Education,
XVII (January, 1940), 12-17.

³ Educational Policies Commission, National Education Association of the United
States and the American Association of School Administrators, The Purposes of Educa-
tion in American Democracy (Washington, D. C.: The Association, 1201 Sixteenth
Street, N.W., 1938), pp. ix-157.

METHOD OF ATTAINING THESE GOALS 


As has been shown in this paper, there is fair agreement between the 
laymen and the professional educators on the goals of education. The differ- 
ence, if one exists, between these two groups, then, lies not in the ultimate 
goals but may lie in their convictions concerning the methods for attaining 
these goals. Certain laymen seem to have great faith in the transfer of 
values gained through studying subject matter and in the disciplinary values 
which accrue from such study. The instructional program toward which we 
are moving in Michigan emphasizes wrestling with the real problems of 
life and learning how to participate in living by helping to solve the prob- 
lems of life. The program is based on the assumption that the best guarantee 
of transfer is having practice in situations most like the ones to which the 
thing learned is to be applied. Since the method of evaluation bears a 
direct relationship to the method for carrying out the goals of education, 
the importance of mutual understanding and agreement between the two
points of view as to the methods as well as the goals of education cannot be
over-estimated.


EMPHASIS IN THE EVALUATION OF EDUCATIONAL ACHIEVEMENT


Popular evaluation. Both the laymen and the professional educators 
expend a great deal of energy in evaluating the achievements of the school.
Too often the layman bases his evaluation almost entirely on the cost in 
dollars and cents of procuring desired facilities. When the question is 
raised as to providing health instruction, a more adequate supply of materials 
of instruction or better trained teachers, the question that occurs to many 
laymen is not what contribution to the better education of the children will 
accrue, but how much the thing desired will raise the tax levy. It seems at
times that textbooks, teachers, supervision of instruction, research and ad- 
ministration are evaluated solely in terms of the tax dollar. Again, the lay- 
man may base his evaluation of the school upon the failure of a few gradu- 
ates or a few non-graduates who attended the school for a time and who 
failed to make good in the business and social activities of the local com-
munity. These familiar types of evaluations (and many others can be cited)
have some significance, but all well-informed persons recognize that the
evaluations are based upon inadequate samplings of relevant and available 
data bearing upon the purposes of education. 

The laymen are not the only persons engaged in the acts of evaluating 
the efficiency of the school. Teachers devote most of their time to making 
evaluations of some type or other, even though these same teachers un-
thinkingly may deny engaging in activities that have any of the character-
istics of that newfangled term, evaluation. In a sense, the teacher in almost
every recitation is attempting to ascertain whether or not the learner has 
mastered what he is supposed to have learned. The teacher on the basis 
of the responses made approves or disapproves of the pupil for his efforts, 
makes a new assignment or orders the lesson taken over, perhaps adding 
some kind of punishment. The ordinary written lesson or the examination 
in common use is another type of evaluation engaged in by every teacher. 
Such examination may involve: material which has been taught or material 
to be applied to new situations, inquiries emphasizing facts or the use of 
data in problem situations, questions stressing either immediate or delayed 
recall. No matter what the nature of the examination may be, it is designed 
to determine whether or not the level of achievement of the pupils in the 
subjects taught is as high as should be expected. On the basis of the deci- 
sions reached, pupils are passed or failed, promoted or not promoted, allowed 
or not allowed to play football or take part in extracurricular activities, 
permitted or not permitted to graduate, elected or not elected to honor 
societies, or admitted or not admitted to colleges or universities. Surely 
sufficient familiar illustrations have been mentioned to convince one that
evaluation is an integral part of the instructional process. 


The evaluation of the teacher is probably more adequate and more 
significant than that of the layman, but near the beginning of the present 
century at the dawn of the application of science to the study of educational 
problems, it became evident that the evaluations usually made by the class-
room teachers were almost as inadequate as those made by the layman. It
became clear that the conditions under which the testing was done were 
not standardized, the content of the examinations was not well chosen, and 
the results could not be interpreted because there was no knowledge of the 
type of responses which a child of a given age or grade or a child of 
age-in-grade should make. 


Technical evaluation. Out of the shortcomings of the layman and 
teacher type of evaluation came a desire for more adequate measurements. 
The answer came with the advent of science in education. So-called stan- 
dardized measurement became the answer to a prayer for more exact meas- 
urement. Near the beginning of the present century, Binet needed an instru- 
ment to select the children in Paris who should be placed in institutions for 
the feeble-minded. To aid him in his task he developed a series of tests 
of mental ability known as the Binet 1908 scale. This scale consisted of a 
series of test situations to which the children were exposed. On the basis 
of the responses made by children of various ages he determined norms of 
achievement for these ages. The Binet test differed from the ordinary type 
of examination in that all children were exposed to the same exercises under 
standardized conditions and the results were scored by the same pattern and 
interpreted in terms of given norms of achievement. 


Following the early work of Binet many revisions in tests were made. 
Later group tests of mental ability were developed. Simultaneously with the 
development of tests of mental capacities came the deluge of standardized 
achievement tests. All are familiar with the Thorndike and Ayres Hand- 
writing Scale; the Hillegas Composition Scales; the Courtis, Woody, Cleve- 
land, Monroe Arithmetic Scales; and the Thorndike, McCall, Monroe, 
Burgess Reading Tests. In 1915 when the Woody Arithmetic Scales were 
first put on the market, one could count all of the standardized achieve- 
ment tests on the fingers of his two hands; in 1940 the number of such 
tests on file in the Bureau of Educational Reference and Research at the 
University of Michigan approximates three thousand. 


In the early development of the standardized instruments of measure- 
ment those responsible for making the tests emphasized the most mechanical 
aspects of the subject under consideration. Naturally they attacked the most 
easily measured phases of the subject. The test makers were conscious of 
the fact that many desirable aspects of the teaching of any subject were not
emphasized in the tests being developed, but in the early stages of the move-
ment no one knew how to develop instruments emphasizing the less mechani-
cal aspects of the subjects. Furthermore, if the test makers had attempted
to measure the more subtle values to be obtained from the teaching of the
various subjects, the techniques to be employed first had to be developed.
It is too much to expect that this new movement should spring forth in a
state of perfection. However, it is ill-becoming a scholar after two decades
of experience with the use of the instruments of measurement to belittle the
efforts of these early workers. No one understood the shortcomings or set
them forth better than some of these early test makers. As evidence of this 
understanding, note the changes which these test makers have made in 
their later tests or which have been made in the tests constructed by indi- 
viduals who received their training at the hands of the pioneers in the field 
of measurement. 

As the use of standardized tests increased and efforts at interpretation 
multiplied, both the values and the inadequacies were evident. Test users 
began to administer a single short test and to assume that a reliable measure 
of the values to be derived from the teaching of a subject had been obtained. 
Test users began to commit an unpardonable sin by making the claims they did
from such inadequate test results. Pupils who happened to make a low 
score on a mental test were termed dull and often were rashly assigned to 
a “Z” section of the class or to an ungraded room. Every effort was made
to make each pupil in a class attain the norm for the class as a whole when 
it is axiomatic that since the norm of achievement for a class is the median or 
mean of the scores, half of the scores must be below that median or mean. 
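
The arithmetic behind this point may be sketched in a few lines of Python. The scores below are invented for illustration and are not taken from any test discussed in this paper; the helper name `share_below` is likewise hypothetical. The sketch shows only that, once the class norm is set at the median, a fixed share of the pupils must fall below it, so that a demand that every pupil attain the norm cannot be met.

```python
from statistics import mean, median


def share_below(scores, norm):
    """Return the fraction of scores strictly below the given norm."""
    return sum(1 for s in scores if s < norm) / len(scores)


# Hypothetical class of ten pupils (illustrative scores only).
scores = [12, 15, 18, 20, 22, 24, 27, 30, 33, 39]

class_median = median(scores)  # midpoint of the ordered scores
class_mean = mean(scores)      # arithmetic average of the scores

print("median:", class_median, "share below:", share_below(scores, class_median))
print("mean:", class_mean, "share below:", share_below(scores, class_mean))
```

In this invented class half of the pupils fall below the median and, as it happens, below the mean as well. With a markedly lopsided set of scores the share below the mean can depart from one half, but the share strictly below the median can never exceed it.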

Latest developments in standardized measurement. Time does not per- 
mit adequate discussion of the use and abuse of the instruments of the early 
measurements, but during the last decade the necessity of two improvements 
became apparent: (1) the need for additional kinds of data for aiding in 
the interpretation of the results procured, and (2) the need for the develop- 
ment of instruments of evaluation that emphasized the types of values 
essential to democratic living. 

The former need has been met, for example, by the creation of score 
cards for recording behavior in the classroom and playground through the 
time sampling technique; by the creation of the inventory of reading and 
study habits; by keeping behavior journals including essential patterns of 
behavior in the classroom, in the home and elsewhere; by making longi-
tudinal studies of the many aspects of achievement of the pupils through
a period of years; by interpreting the responses of children in terms of pat-
terns of growth; and by observing not only the level of achievement at-
tained but also the process by which the result was achieved.
Furthermore, new types of standardized or non-standardized tests were con- 
structed to supplement and to throw light on the results achieved on the 
standardized tests. All of these new types of instruments and devices have 
been developed in order to give a better basis for determining the meaning 
and significance of the results obtained on the standardized tests. The use 
of these devices seemed essential to safeguard the wholesome use rather 
than the abuse of desirable instruments of evaluation. 

The latter need, i.e., that of the development of instruments of evalua- 
tion with emphasis on the types of values stressed in the purposes of edu- 
cation essential to democratic living, is being met by the preparation of 
tests for the measurement of cooperation, thinking, interpretation of data, 
the nature of proof, the application of principles, planning, attitudes
towards community problems, social attitudes, and other tests on the so- 
called dynamic qualities. The recent tests produced by the committees on 
evaluation of the Progressive Education Association are good illustrations 
of these instruments of evaluation. It is interesting to observe that even 
though the Progressive Education Association has been the highest critic of 
the measurement movement, it is now the most prolific agency in pro- 
ducing standardized instruments of measurement. However, it seems fair 
to add that in the opinion of the writer this association is making a real 
contribution to the cause of measurement. 

Caution required in appraising recent efforts. Recent developments in 
evaluation are primarily extensions of earlier efforts. The test makers of the 
Progressive Education Association may make the same kinds of mistakes 
that previous makers of tests have made. The test makers of the Progressive 
Education Association are applying the same technique employed by those 
who developed the standardized achievement tests. The only difference in 
the two groups of workers is in the type of values being measured. Instead 
of measuring mechanical processes of arithmetic or reading, as was done by 
the pioneers of the testing movement, the test makers of the Progressive 
Education Association are attempting to measure cooperation, planning, so- 
cial sensitivity, or interests. They are using the same statistical techniques, 
the same formulae for standardization and interpretation. In most cases
they are attempting to develop instruments of measurement in the abstract 
divorced from purpose or from the activities of group living. They stress 
end points of achievement rather than the process of achieving. They em- 
phasize isolated elements of behavior rather than an element in relation to 
the behavior of the organism as a whole. It may be that the present efforts
of the Progressive Education Association are conditioned by the specific 
bond rather than by the organismic theory of psychology. For further dis- 
cussion of this line of thinking, see the article by Orata on “Evaluating 
Evaluation” in the Journal of Educational Research.⁴


CONCLUSION 


This discussion began with the assumption that the schools belong to 
the people, and that the people will have the kind of schools that they want. 
It was asserted that the emphasis on instruction tends to follow the values 
stressed in the measurement of the educational products. Data were then 
presented to show that the school public and the professional educators had 
agreement in the values which should be stressed in education, and it was 
suggested that, while there was agreement in the general objectives, there 
may be difference in the means of achieving these objectives. Effort was then
made to show evolution in the process of evaluation of achievement in the 
schools. It was pointed out how instruments of measurement gradually 
evolved from simple uncontrolled processes of evaluation emphasizing me- 
chanical aspects of the teaching of the various subjects to the carefully con- 
trolled and administered instruments of evaluation emphasizing many types 
of behavior essential to democratic living. Finally a warning was sounded 
suggesting that the later development in measurements may miss the mark 
by neglecting purpose, process of achieving, and the relation of a given 
response to the responses of the organism as a whole. 

In general the gist of this discussion may be summed up in two gen- 
eralizations: (1) All education should contribute to those values which 
are essential to successful living in our democracy; (2) The program of 
evaluation within the program of the school should emphasize the same 
values as are stressed in life, should have its setting in the life of the 
school and community living, and should be as broad as such worthy living. 


⁴ Orata, Pedro T. “Evaluating Evaluation,” Journal of Educational Research,
XXXIII (May, 1940), 641-661.
GENERAL STATEMENT ON EVALUATION


RALPH W. TYLER 
University of Chicago 


Editor's note: Measurement in a “setting” is one of the important ad-
vances of the decade. The author describes the purposes, techniques, and
assumptions in this important educational advance. 

WHEN Professor Woody asked me to contribute a general statement 
on evaluation as part of this special issue of the Journal of Educational
Research, I wrote him that I had so frequently set forth my conception of
evaluation that I feared another statement would be “old stuff” to the 
readers of the Journal. However, Mr. Woody replied that a formulation to 
appear as part of this issue would be desirable even though the elements 
have already been presented in other publications. It is my hope that this 
statement, though not new, will serve as a succinct exposition of one point 
of view regarding evaluation. 


PURPOSES OF EVALUATION 


In perceiving the appropriate place of evaluation in modern education
consideration must be given to the purposes which a program of evaluation
may serve. At present the purposes most commonly emphasized in schools
and colleges are the grading of students, their grouping and promotion,
reports to parents, and financial reports to the board of education or to the
board of trustees. A comprehensive program of evaluation should serve a
broader range of purposes than these. 


One important purpose of evaluation is to make a periodic check on 
the effectiveness of the educational institution, and thus to indicate the 
points at which improvements in the program are necessary. 


Another important purpose of evaluation which is frequently not recog-
nized is to validate the hypotheses upon which the educational institution
operates. A school or college organizes its curriculum on the basis of a 
plan which seems to the staff to be satisfactory, but in reality we do not 
yet know enough about curriculum construction to be sure that a given plan 
will work satisfactorily in a particular community. On that account, the 
curriculum of each school or college is based upon hypotheses, that is, the 
best judgments the staff can make on the basis of information it has. In 
some cases these hypotheses are not valid, and the educational institution 
may continue for years utilizing a poorly organized curriculum because no
careful evaluation has been made to check the validity of the hypotheses on
which the curriculum is operating. Similarly, a program of guidance in any 
school system is largely based on hypotheses which have not been adequately 
validated, and again the effectiveness of the program may be greatly reduced 
because some of these hypotheses are not valid. Furthermore, many of our 
administrative policies and practices are based upon judgments which in a 
particular case may not be sound. Every educational institution has the re- 
sponsibility of testing the major hypotheses upon which it operates and of 
adding to the fund of tested principles upon which schools may better 
operate in the future. 


A third important purpose of evaluation is to provide information
basic to effective guidance of individual students. Only as we appraise the
student's achievement and as we get a comprehensive description of his 
growth and development are we in a position to give him sound guidance. 
This implies evaluation sufficiently comprehensive to appraise all the sig- 
nificant aspects of the student’s accomplishments. Merely the judgment that 
he is doing average work in a particular course is not enough. We need
to find out more accurately where he is progressing and where he is having 
difficulties. 


A fourth purpose of evaluation is to provide a certain psychological 
security to the school or college staff, to the students, and to the parents. 
The responsibilities of an educational institution are broad and involve 
aspects which seem quite intangible to the casual observer. Frequently the 
staff becomes a bit worried and is in doubt as to whether it is really ac- 
complishing its major objectives. This uncertainty may be a good thing 
if it leads to a careful appraisal and constructive measures for improvement 
of the program; but without systematic evaluation the tendency is for the 
staff to become less secure and sometimes to retreat to activities which give 
tangible results although they may be less important. Often we seek se- 
curity through emphasizing procedures which are extraneous and sometimes 
harmful to the best educational work of the school. Thus, high-school 
teachers may devote an undue amount of energy to coaching for scholarship 
tests or college entrance examinations because the success of students on 
these examinations serves as a tangible evidence to the teacher that some- 
thing has been accomplished. However, since these examinations may be 
appropriate for only a portion of the high-school student body, concentra- 
tion of attention upon them may actually hinder the total educational pro- 
gram of the high school. For such teachers a comprehensive evaluation 
which gives a careful check on all aspects of the program would provide 
the kind of security that is necessary for their continued growth and self- 
confidence. Students and parents are also subject to this feeling of inse- 
curity and in many cases desire some kind of tangible evidence that the 
educational program is effective. If this is not provided by a comprehensive 
plan of evaluation, then students and parents are likely to turn to tangible 
but extraneous factors for their security. 


A fifth purpose of evaluation which should be emphasized is to provide 
a sound basis for public relations. No factor is as important in establish- 
ing constructive and co-operative relations with the community as an under- 
standing on the part of the community of the effectiveness of its educational 
institutions. A careful and comprehensive evaluation should provide evi- 
dence that can be widely publicized and used to inform the community 
about the value of the school or college program. Many of the criticisms 
expressed by patrons and parents can be met and turned to constructive co- 
operation if concrete evidence is available regarding the accomplishments 
of the school or college. 


A sixth purpose of evaluation is to help both teachers and pupils to 
clarify their purposes and to see more concretely the directions in which 
they are moving. Appraisal demands a clear conception of the results hoped 
for; hence both teachers and pupils are stimulated by an evaluation pro- 
gram to define these anticipated results. This definition of results sought 
serves to guide the efforts of both teacher and learner. For this reason the 
participation of both teachers and pupils in planning and conducting 
evaluation processes is of vital importance. 

Evaluation can contribute to these six purposes. It can provide a 
periodic check which gives direction to the continued improvement of the 
program of the school or college; it can help to validate some of the im- 
portant hypotheses upon which the program operates; it can furnish data 
about individual students essential to wise guidance; it can give a more 
satisfactory foundation for the psychological security of the staff, of parents, 
and of students; it can supply a sound basis for public relations; and it can 
help both teachers and pupils to clarify their goals. For these purposes 
to be achieved, however, they must be kept continually in mind in planning 
and in developing the program of evaluation. The decision as to what is 
to be evaluated, the techniques for appraisal, and the summary and interpreta- 
tion of results should all be worked out in terms of these important purposes. 


UNDERLYING ASSUMPTIONS 


In the development of a program for evaluating the outcomes of gen- 
eral education, certain basic assumptions are helpful. Six of them are of 
particular importance. In the first place, it is assumed that education is a 
process which seeks to change the behavior pattern of human beings. It is 
obvious that we expect students to change in some respects as they go 
through an educational program. An educated man is different from one 
who has no education and presumably this difference is due to the educa- 
tional experience. It is also generally recognized that these changes brought 
about by education are modifications in the ways in which the educated 
man reacts, that is, changes in his ways of behaving. Generally, as a result 
of education we expect students to recall and to use ideas which they did 
not have before, to have developed various skills, as in reading and in writ- 
ing, which they did not previously possess, to have improved their ways of 
thinking, to have modified their reactions to aesthetic experiences as in the 
arts, and so on. It seems safe to say, on the basis of our present conception 
of learning, that education, when it is effective, changes the behavior patterns 
of human beings. 


A second basic assumption involved in evaluation is that the kinds of 
changes in behavior patterns in human beings which the school or college 
seeks to bring about are its educational objectives. The aims of any educa- 
tional program cannot well be stated in terms of the content of the program, 
or in terms of the methods and procedures followed by the teachers, for these 
are only means to other ends. Fundamentally, the purposes of education 
represent these changes in human beings which we hope to bring about 
through education. The kinds of ideas which we expect students to get and 
to use, the kinds of skills which we hope they will develop, the techniques 
of thinking which we hope they will acquire, the ways in which we hope 
they will learn to react to aesthetic experiences—these are illustrations of 
educational objectives. 

A third basic assumption is that an educational program is appraised 
by finding out how far the objectives of the program are actually being 
realized. Since the program seeks to bring about certain changes in the 
behavior of students, and since these are the fundamental educational objec- 
tives, then it follows that an evaluation of the educational program is a 
process for finding out to what degree these changes in the students are 
actually taking place. 


The fourth basic assumption is that the way in which the student or- 
ganizes his behavior patterns is an important aspect to be appraised. There 
is always the danger that the identification of these various types of objec- 
tives will result in their treatment as isolated bits of behavior. Thus, the 
recognition that an educational program seeks to change the student's in- 
formation, skills, ways of thinking, attitudes, and interests, may result in 
an evaluation program which appraises the development of each of these 
aspects of behavior separately, and makes no effort to relate them. We must 
not forget that the human being reacts in a fairly unified fashion; hence, in 
any given situation information is not usually separated from skills, or from 
ways of thinking, or from attitudes, interests, and appreciations. For ex- 
ample, a student who encounters an important social-civic problem is ex- 
pected to draw upon his information, to use such skill as he has in locating 
additional facts, to think through the problem critically, to make choices 
of courses of action in terms of fundamental values and attitudes, and to be 
continually interested in better solutions to such problems. This clearly in- 
volves the relationship of various behavior patterns and their better in- 
tegration. So that this interrelation will not be neglected it seems necessary 
to emphasize as a basic assumption that the way in which the student relates 
his various reactions is an important aspect of his development and an 
important part of any evaluation of his educational achievement. 


A fifth basic assumption is that the methods of evaluation are not 
limited to the giving of paper-and-pencil tests; any device which provides 
valid evidence regarding the progress of students toward educational objec- 
tives is appropriate. As a matter of practice, most programs of appraisal 
have been limited to written examinations or paper-and-pencil tests of some 
type. Perhaps this has been due to the long tradition associated with written 
examinations or perhaps to the greater ease with which written examinations 
may be given and the results summarized. However, a consideration of the 


March, 1942] GENERAL STATEMENT ON EVALUATION 497 


kinds of objectives formulated shows that written 
examinations are not likely to provide an adequate appraisal for all of these 
objectives. A written test may be a valid measure of information recalled 
and ideas remembered. In many cases, too, the student's skill in writing and 
in mathematics may be shown by written tests, and it is also true that 
various techniques of thinking may be evidenced through more novel types 
of written test materials. On the other hand, evidence regarding the im- 
provement of health practices, regarding better personal-social adjustment 
of students, regarding interests and attitudes, may require a much wider 
repertoire of appraisal techniques. This assumption emphasizes the wider 
range of techniques which may be used in evaluation such as observational 
records, anecdotal records, questionnaires, interviews, check lists, records of 
activities, products made, and the like. The selection of evaluation techniques 
should be made in terms of the appropriateness of that technique for the 
kind of behavior to be appraised. 

A sixth basic assumption is that the participation of teachers, pupils, 
and parents in the processes of evaluation is essential to derive the maximum 
values from a program of evaluation. They all have a stake in the educational 
program of school or college. They can all contribute to the formation and 
clarification of objectives, they are all in a position to obtain evidence about 
the progress pupils are making, they can all benefit from efforts to interpret 
the results of appraisal. The processes of evaluation help to guide both 
teachers and pupils and may help parents in understanding the work of the 
school. Finally, the development of an increasing degree of self-evaluation 
is in itself a major goal of democratic education. 


A comprehensive program of evaluation utilizes other assumptions but 
these six are of particular importance because they suggest the general pro- 
cedure by which an evaluation program can be developed. They show the 
necessity of basing an evaluation program upon educational objectives, and 
they indicate that educational objectives for purposes of evaluation must be 
stated in terms of changes in behavior of students; they emphasize the im- 
portance of the relation of various aspects of behavior rather than the treat- 
ment of them in isolation, they make clear the possibility of a wide range of 
evaluation techniques, and they suggest the co-operative responsibilities of 
teachers, pupils, and parents. 


EVALUATION PROCEDURE 


The general procedure followed in evaluation involves several major 
steps. It is first necessary for the school to formulate a statement of its edu- 
cational objectives; then these statements of objectives are classified into 
major types. Without effort at classification, the objectives are likely to be 
of various levels of generality and specificity, and too numerous for prac- 
ticable treatment. Furthermore, the classification into types of objectives 
indicates the kinds of evaluation procedures essential to an adequate appraisal. 

The next step is to define each of these types of objectives in terms of 
behavior. This step is necessary because in any list, some objectives are 
likely to be stated in terms so vague and nebulous that the kind of behavior 
they imply is not clear. Thus, a type of objective such as the development 
of effective methods of thinking may mean different things to different 
people. Only as “effective methods of thinking” is defined in terms of the 
range of reactions expected of students can we be sure what is to be 
evaluated under this classification. 


After a clear definition of the kinds of behavior we are trying to 
appraise has been obtained, the next problem is to identify situations in 
which students can be expected to display these types of behavior so that 
we may know where to go to obtain evidence regarding this objective. If 
each objective has been clearly defined, this step is not difficult. For ex- 
ample, if our definition of objectives has identified as one educational goal, 
the ability to locate dependable information relating to specified types of 
problems, then it seems obvious that a situation which would give students 
a chance to show this ability would be one in which they were asked to 
find information relating to these specified problems. 

One value of this step is to suggest a much wider range of situations 
which might be used in evaluation than have commonly been utilized. By 
the time this step has been completed, there will usually be listed a con- 
siderable number of types of situations which give students a chance to 
indicate the sort of behavior patterns they have developed. These can be 
considered potential “test situations.” 

The next step in this general evaluation procedure involves the selection 
and trial of promising methods for obtaining evidence regarding each type 
of objective. Before attempting to construct any new evaluation instruments, 
it is generally a good plan to examine tests and other instruments already 
developed to see whether they will serve as satisfactory means for appraising 
the objective. Any group working on an evaluation program will find useful 
a bibliography of evaluation instruments such as the Buros Mental Measure- 
ments Yearbook.* This bibliography not only lists tests and other appraisal 
instruments which are commercially available, but also includes several 
critical reviews of each test written by teachers, curriculum constructers, and 
test makers. These reviews help in selecting from available instruments those 
which might be worth a trial. 

Usually at this point it is found that no tests are available to measure 
certain of the objectives emphasized. In such cases it becomes necessary to 
construct additional new instruments in order to make a really comprehen- 
sive appraisal. In constructing these instruments it is helpful to set up 
some of the potential test situations suggested in the preceding step and 
actually to try them out with students to see how far they can be used 
effectively. 

The next step is to select on the basis of this preliminary trial the 
more promising appraisal methods for further development and improve- 
ment. The basis of selection will include the degree to which the appraisal 
method is found to give results consistent with other evidences regarding 
the student's attainment of this objective and the extent to which the ap- 
praisal method can be practicably used under the conditions prevailing in 
the school or college. 


An important problem in the refinement and improvement of an evalua- 
tion instrument is the determination of the aspects of student behavior to 
be summarized and the decision regarding the units or terms in which each 
aspect will be summarized. The reaction of a human being in any test situa- 
tion is sufficiently complex so that several aspects could be measured and 
several possible units of measurement could be used. The choice should be 
made in terms of the significance of the several aspects and the appropriate- 
ness of the results. Another task in refining and improving an evaluation 
instrument is to make revisions which give more clear-cut measures, which 
provide a more representative and adequate sample of the student's reaction 
and which improve the ease with which the instrument can be used. 


* Buros, Oscar K., Editor. The Mental Measurements Yearbook. Highland Park, 
New Jersey, 1941. 

A final step in the procedure of evaluation is to devise means for in- 
terpreting and using the results of the various instruments of evaluation. 
The previous steps have resulted in the selection of, or the development of, 
a range of procedures which can be used periodically in appraising the degree 
to which students are acquiring the several important educational objectives. 
These instruments will give a series of scores and descriptions which will 
serve to describe and to measure various aspects of the behavior patterns of 
the students. Presumably each of these scores or verbal summaries can be 
compared with scores or verbal summaries previously obtained so that some 
evidence of change or growth of students is available. However, the mean- 
ing of these scores, that is, their significance in interpretation, becomes fuller 
through various sorts of studies. One such study is the development of 
norms, that is, the identification of scores typically made by students in 
similar classes, in similar institutions, or with other similar characteristics. 
Another helpful study is one involving the typical growth or changes made 
in these scores from year to year. A third study involves the interrelation- 
ship of several scores to identify patterns. It is important in this step to 
examine the progress students are making toward each of the several objec- 
tives in order to get more clearly the pattern of development of each student 
and of the group as a whole and also to obtain hypotheses which help to 
explain the types of development taking place. An important purpose of 
evaluation is to provide evidence which suggests hypotheses for the modifica- 
tion and improvement of the curriculum. Each school and college needs to 
develop methods for interpreting and using the results of appraisal so as 
to improve the educational program and to guide individual students more 
wisely. 
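
The three kinds of studies named above (norms, typical growth, and the interrelationship of scores) can be illustrated with a small computation. What follows is only a minimal sketch in Python and is not a procedure given in the text: the function names, the choice of a percentile rank as the norm, the use of the Pearson coefficient for interrelationship, and every score shown are assumptions made up for illustration.

```python
from statistics import mean


def percentile_rank(score, norm_scores):
    """Norm study (assumed form): percent of a comparison group
    scoring at or below the given score."""
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


def growth(earlier, later):
    """Growth study: change in each student's score between two
    administrations of the same instrument."""
    return [b - a for a, b in zip(earlier, later)]


def pearson(xs, ys):
    """Interrelationship study (assumed form): Pearson correlation
    between two sets of scores for the same students."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores, invented purely for illustration.
norm_group = [40, 50, 60, 70, 80]   # students in similar classes
fall_scores = [50, 60, 70]          # three students, first testing
spring_scores = [55, 58, 80]        # same students, later testing

rank_of_60 = percentile_rank(60, norm_group)
changes = growth(fall_scores, spring_scores)
relation = pearson(fall_scores, spring_scores)
```

A summary of this kind describes only the numerical scores; the verbal descriptions and the pattern of development of each student still call for the judgment of the staff.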

CONTINUAL EVALUATION 


This brief description of the steps followed in evaluation should have 
indicated that the process of evaluation is an integral part of the educational 
process. It does not mean simply the giving of a few ready-made tests and 
the tabulation of resulting scores. It is a recurring process involving the 
formulation of objectives, their clearer definition, plans to study students' 
reactions in the light of these objectives, and continued efforts to interpret the 
results of such appraisals in terms which throw helpful light on the educa- 
tional program and on the individual student. This sort of procedure goes 
on as a continuing cycle. Studying the results of evaluation often leads to 
a reformulation and some improvement in the conception of the objectives 
to be obtained. The results of evaluation and any reformulation of objec- 
tives will suggest desirable modifications in the teaching and in the educa- 
tional program itself. Modifications in the objectives and in the educational 
program will result in corresponding modifications in the plan and program 
of evaluation. So the cycle goes on. 


This program of evaluation is also a potent method of continued 
teacher education. The recurring demand for the formulation and clarifica- 
tion of objectives, the continuing study of the reactions of students in terms 
of these objectives, and the persistent attempt to relate the results obtained 
from various sorts of measurement are all means for focusing the interests 
and efforts of teachers upon the most vital parts of the educational process. 
Evaluation provides a means for the continued improvement of the program 
of education, for an ever deepening understanding of students with a 
consequent increase in the effectiveness of our educational institutions. 


EVALUATION OF GENERAL EDUCATION IN COLLEGES 


ALVIN C. EURICH 
Stanford University 


Editor's note: The evaluation movement has many applications. The 
author explains the application of the evaluation concept to general education. 


With American colleges and universities so utterly confused on the 
nature and scope of an adequate general education, the subject of evaluating 
such programs may seem to some readers to be prematurely injected into 
this symposium. On the contrary, the fact that college authorities do not 
agree as to what constitutes general education has in itself stimulated more 
extensive appraisals of the programs now offered. Also, the experimental 
approaches are so definitely spotlighted that those responsible for them are 
virtually forced to collect evidence showing the extent to which they accom- 
plish what they set out to do. Colleges and universities offering the more 
conventional patterns of work are not faced with the same demands. Their 
efforts receive general approval even though little evidence of their effective- 
ness is available. The most comprehensive evaluations of general education 
have, therefore, been developed in institutions offering the newer programs. 

Some of the major approaches designed to evaluate college programs 
of general education are summarized briefly in this article. They range all 
the way from mere reiterations of beliefs in a program to comprehensive 
and systematic appraisals. 


Appraisal in Terms of Expressed Beliefs. Although not carried on 
systematically, the major form of appraising programs of general education 
is a simple expression of beliefs. College faculties generally decide that the 
program they offer is adequate or inadequate wholly through the beliefs of a 
dominant group or a majority. This method is apparent in the extreme in 
the Hutchins’ school which has demanded attention during the past decade. 
Their general formula is first, to denounce the prevailing programs; second, 
to advance their system as the only solution. 


For example, Hutchins in a recent article generalizes as follows: 


“Since our students have lived up to our expectations, we have suc- 
ceeded in postponing maturity to a date undreamed of in the Middle Ages, 
or even in Europe today. The American college senior is two or three years 
less grown up than his French or British contemporary. In ability to use 
his mother tongue and the other instruments of intellectual operation he 
does not at all compare with them.”¹ 


Evidence for this sweeping generalization is wholly lacking. In fact, the 
author even fails to consider the degree to which the French and British 
contemporaries of college seniors are selected in comparison with those in 
America. 

If Hutchins had gone on to appraise the program in which he believes, 
namely that in operation at St. John’s, he undoubtedly would have said that 
the seniors there are equal if not superior to their French and British con- 
temporaries. This is, by implication at least, what his collaborator, Adler, 
meant when he wrote: 


“We shall not have genuine universities again until all the preparatory 
stages of education are radically reformed, until the college, above all, is 
restored to its liberal function. . . . Only one college in this country, St. 
John’s at Annapolis, is working for the revival of a liberal curriculum. Only 
the University of Chicago has throughout its history manifested devotion to 
the true function of a university—formation of fundamental doctrines, de- 
bate of the most serious issues . . . If in some way the spiritual union of 
St. John’s and Chicago could be consummated, we might hope for the 
blessed event of a cultural rebirth.”² 


Ordinarily in summaries of studies no attention is given to appraisals 
of this nature. In an evaluation of general education, however, they cannot 
be ignored because they have been so convincingly stated and so often re- 
peated by persons in high places that they are being accepted as observa- 
tions based upon facts. Up to now, they are still only beliefs and from the 
standpoint of evaluating general education should be regarded as such. For 
evidence, the results from other approaches need to be studied. 

¹ Hutchins, Robert M. “Education for Freedom,” Harper's, CLXXXIII (October, 
1941), 523. 

² Adler, Mortimer J. “The Chicago School,” Harper's, CLXXXIII (September, 
1941), 387-8. 

Appraisal in Terms of Courses Offered and Required. Another com- 
mon means of appraising a program of general education is that of examin- 
ing the courses offered and required. In this, the strongest and most influ- 
ential institutions set the pattern. To illustrate, the accepted program of gen- 
eral education for the California junior college consists of the lower di- 
vision requirements at the University of California. These include: general 
university requirements covering English composition, military or naval 
science and tactics, and physical education; at least 15 units in not more 
than two foreign languages with some reduction for high school work in 
this field; elementary algebra and plane geometry if not taken in high school; 
at least 12 units in natural science; and a year course chosen from three of 
the following five groups: English and public speaking, foreign language, 
mathematics, social sciences, philosophy. 

Similarly in other sections of the country, one institution or an ac- 
crediting agency sets the pattern for judging the adequacy of a program of 
general education. There are at least two major difficulties with this ap- 
proach: (1) the fact that a student has registered for and “passed” a group 
of courses is no guarantee, as a number of studies have shown in recent 
years, that he has acquired the general education implied by the require- 
ment, and (2) the program set by the dominant institution or organization 
may be in a large part the result of faculty compromises in an effort to recog- 
nize various subjects rather than a carefully considered program in terms of 
what it would do for the students. In spite of these difficulties, this approach 
together with expressed beliefs constitutes the major method for evaluating 
general education. As long as such is the case, general education for Ameri- 
can colleges and universities will remain in a confused state. 


Appraisal in Terms of the Quality of Students Attracted by the Pro- 
gram. The extensive use in recent years of psychological or scholastic 
aptitude tests has led some individuals to evaluate a program of general 
education on the basis of the scores students at the institution make on tests. 
If the scores are high, they assume that the program is not only adequate 
but superior. If the scores are low, they assume that it must be inadequate. 
Clearly either the process of selecting students or the multiplicity of factors 
attracting them is more important in producing relatively high or low scores 
for a given population than the program of general education. Although the 
latter may be one factor, it is practically never isolated as such and cannot, 
therefore, be used as the sole index for the adequacy of a program. In fact, 
it is conceivable that a college where students make relatively low scores on 
standard tests may offer an outstanding program of general education for 
the students it serves whereas another college whose students make high 
scores may offer a program that is mostly borrowed from another institution 
and not at all adequate for its clientele. This method, therefore, is not only 
inadequate, it is misleading—particularly if used alone. It may have some 
value when used with other methods. 

Appraisal by Means of a Comprehensive Testing Program Using Stand- 
ard Devices. Many colleges are now using tests such as the Cooperative Gen- 
eral Culture Test as the major means of evaluating their programs of gen- 
eral education. With all their limitations these tests are far superior to the 
usual practices of appraisal which are exemplified by the methods described 
above. Their limitations are largely reflections of the limitations of the col- 
lege curricula in general education. The tests cover, primarily, knowledge 
of history, social studies, of the outstanding foreign literatures of the world, 
fine arts, mathematics and natural science. In so far as the tests stress the 
same objectives as the courses in general education, they are not only ade- 
quate but the best instruments available; in so far as the courses stress 
additional objectives not covered by the tests, they are inadequate and need 
to be supplemented with other observational devices. 

By far the most extensive appraisal of general education by the use of 
the comprehensive examination technique is that of Pennsylvania high 
schools and colleges conducted by Learned and Wood.³ In this well-known 
investigation comprehensive objective knowledge tests covering the usual 
subject matter areas of general education as offered in most colleges were 
given to more than 55,000 individuals, 3000 of whom were given the same 
tests more than once. The outstanding fact derived from this study is the 
enormous variability (1) among institutions and (2) within each institution. 

The authors state clearly the inference they draw from this fact: 


“Each individual has some level peculiar to himself at which his edu- 
cation in any given subject must begin. Average levels, like the ‘average 
man’, do not exist for practical education. There exist only different start- 
ing points from which alone progress is possible. This suggests that instead 
of expecting the members of a college class to conform to an average, we 
might better arrange circumstances so that each student could make full 
use of what he has learned and could advance from the point where he 
really stands. His permanent gains derived from schooling would thus be 
substantially increased. 

“This suggestion is reenforced by the relentless operation of what 
might be called the first law of learning: Whatever a man learns he must 
learn for himself. In Pennsylvania, certainly, neither professors nor other 
institutional agencies have been able to do students’ learning for them. 
Although they have divided their subjects into year-groups and classes, re- 
quired their attendance on instruction, marked their recitations and tests, 
and granted them a corresponding number of ‘standard’ credits, many of 
these students, in spite of college diplomas, are nevertheless no better in- 
formed on the subjects tested than a large number of pupils still in high 
school; almost one-sixth even lost ground academically during a two-year 
trial period.”⁴ 

³ Learned, William L. and Wood, Ben D. The Student and His Knowledge. New 
York: Carnegie Foundation for the Advancement of Teaching, 1938. Pp. 406. 

Clearly in terms of this study, if knowledge is an important outcome in 
general education, colleges have some responsibility for attempting to make 
their programs more effective. If, on the other hand, they say that knowledge 
is only one aspect of their program, then their responsibility is just as great 
for expanding the scope of their evaluation program in order that evidence 
can be obtained to show the extent to which all major objectives are being 
achieved. For this purpose, the method of comprehensive subject matter 
tests must be supplemented with other means of appraisal. 

Appraisal by Means of Comprehensive Examinations Designed to
Measure Student Achievement in Relation to Course or Area Objectives.
Under the New Plan of General Education adopted by the University of
Chicago in 1931, provision was made for measuring student attainment with
comprehensive examinations. A relatively independent Board of Examinations
was established to formulate the policies which are administered by a
Chief Examiner. From the outset in the construction of examinations considerable
emphasis was placed upon the extent to which each examination
measures the objectives which the instructors have in mind, and upon the
consistency with which the examinations record the achievement of
individual students. According to a statement issued by the Chief Examiner
in 1933,


“The examiners favor the principle that examinations should place 
emphasis on the student's ability to reason with the principles of his subject, 
rather than the ability merely to repeat factual material by rote memory. One 
cannot demonstrate mastery of any subject without some factual material. 
The student should be asked not only to supply whatever factual material 
he is expected to have mastered but also to deal with it in such a manner as 
to involve, whenever possible, some degree of reasoning.”⁵


⁴ Ibid., p. 44.
⁵ Boucher, Chauncey S. and Brumbaugh, A. J. The Chicago College Plan. Chicago:
University of Chicago Press, 1940. Pp. 91-2.
These comprehensive examinations have made possible extensive studies
of student accomplishment and progress. For example, the College operates
on the assumption that it is not necessary to register for courses in order to
pass examinations. In the records for 94 students who received the College
Certificate with honors, there are 66 instances among the 693 examinations
they wrote in which a passing grade was attained without registration for
any part of a corresponding course, 23 instances after one quarter of registration
and 59 after two quarters. Also 34 of 656 students completed the
College requirements in less than six quarters, the number normally required;
8 completed the requirements in three quarters or in one academic
year. Such evidence substantiates at least one of the educational assumptions
on which the College is operating. In other words, a general education as
defined by the Chicago curriculum and comprehensive examinations can be
acquired without working through a specified and rigidly set number of
courses, quarters or semesters, or years.
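
The counts quoted above are easier to weigh as proportions. The following is only an illustrative sketch: the figures are those given in the paragraph, while the function name `percent` and the variable names are assumptions introduced here and do not come from the Chicago report.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Figures quoted in the paragraph above.
EXAMINATIONS_WRITTEN = 693  # examinations written by the 94 honors students
STUDENTS = 656              # students whose time to completion was studied

# Passing grades grouped by how much of the matching course was registered for.
passes_by_registration = {
    "no registration": 66,
    "one quarter": 23,
    "two quarters": 59,
}

for label, count in passes_by_registration.items():
    print(f"{label}: {percent(count, EXAMINATIONS_WRITTEN)} per cent of examinations")

print(f"fewer than six quarters: {percent(34, STUDENTS)} per cent of students")
print(f"three quarters: {percent(8, STUDENTS)} per cent of students")
```

On these figures roughly one examination in ten was passed with no registration at all, so the evidence bears on a minority of cases and not on the typical student.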

The plan of the General College organized at the University of Minnesota
in 1932 likewise provided for comprehensive examinations in place
of course credits as the basis for determining student progress. Course tests
were not eliminated; they were used largely to give the student some knowledge
of his readiness for taking the comprehensive examinations. They furnished,
too, a fertile ground for trying out test forms before incorporating
them in the more comprehensive situations.

The Minnesota plan for constructing examinations may be summarized 
briefly in terms of the following phases: 

1. Formulation of course objectives.

2. Formulation of the objectives for the comprehensive examinations.

3. Definitions of each objective in terms of student behavior.

4. Collection of material for each course in terms of objectives, such
as vocabulary, facts, apa gs illustrations, and problems.

5. Formulation of specific test items calling for student responses to
measure each objective.

6. Evaluation, by the specialist in the field, of each test item in terms
of subject matter and the objective being measured.

7. Administration of the tests to the students as short quizzes, mid-term
or final examinations.

8. Evaluation of each test item in terms of its discriminative power.
The total score on all items within a given section was used as
the criterion for the evaluation of items within that section.

9. Determination of the reliability of each test.

10. Selection of test items on the basis of their discriminative power,
objective, and subject matter for inclusion in the comprehensive
examinations.

11. Further evaluation of items in the comprehensive examinations and
of the total tests.

12. Filing, for further use, of test items under field and objective.
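
Phases 8 and 9 can be illustrated with a short computation. The sketch below is not the Minnesota procedure itself: the source says only that the section total served as the criterion for each item and that reliability was determined. The particular statistics chosen (an item-total correlation and a split-half coefficient stepped up by the Spearman-Brown formula), the function names, and the sample data are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def item_discrimination(responses):
    """Correlate each item (scored 0 or 1) with the section total.

    `responses` holds one list of item scores per student. The total
    includes the item itself, as when the section total is the criterion.
    """
    totals = [sum(student) for student in responses]
    n_items = len(responses[0])
    return [
        pearson([student[i] for student in responses], totals)
        for i in range(n_items)
    ]


def split_half_reliability(responses):
    """Odd-even split-half reliability, stepped up by Spearman-Brown."""
    first_half = [sum(student[0::2]) for student in responses]
    second_half = [sum(student[1::2]) for student in responses]
    r_half = pearson(first_half, second_half)
    return 2 * r_half / (1 + r_half)


# Hypothetical scores: five students, four items in one section.
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print([round(d, 2) for d in item_discrimination(sample)])
print(round(split_half_reliability(sample), 2))
```

Items whose correlation with the section total is low or negative would be the candidates for revision or exclusion at phase 10.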


In the administration of the program the preparation of each test in- 
volved at least four and sometimes more persons: an instructor who was the 
specialist in subject matter, a research coordinator concerned with all the 
content in a comprehensive area, a graduate assistant with considerable train- 
ing and experience in the field, and an examination counselor who had ex- 
perience and interest in the field for which he advised and who had made a 
special study of examinations. Thus the subject matter specialists and 
evaluation consultants worked together.⁶


The examinations developed were used for judging student progress,
for appraising the effectiveness of courses in attaining various objectives,
and for studying systematically a wide variety of problems that arose in
relation to the new program. The most significant result of such procedures
is undoubtedly the effect upon the faculty. When they are constantly attempting
to clarify their conception of what they are trying to achieve, to
interpret that conception in terms of student behavior and to devise situations
in which they may observe whether or not the student manifests the
behavior desired, they develop an alertness toward the teaching of students
that is seldom attained through any other process.


Appraisal in Terms of Subsequent Academic Achievement. As parts of
other studies, programs of general education are at times evaluated in terms
of the subsequent academic achievement of graduates and former students.
For example, Fred C. Zapffe, Secretary of the Association of American
Medical Colleges, summarizes as follows:


“Careful study of results over a long period of years has shown that
students who have neglected the cultural subjects do not do as well in
medical school as do their fellows who have pursued the opposite course.
So, we find, that the ranking group of students in medical school are the
bachelors of arts; then come the three-year college men; then the two-year
men; and fourth, and last, the bachelors of science. This is not an exceptional
occurrence. It has happened that way for the past ten years during
which this study has been made.”⁷

⁶ Eurich, Alvin C. and Johnson, Palmer O. “The Experimental Examination Program
in the General College.” Chapter III in The Effective General College Curriculum
as Revealed by Examinations. Minneapolis: University of Minnesota Press, 1937.


March, 1942] GENERAL EDUCATION IN COLLEGES 509


Furthermore, the evidence from experimental colleges such as the General
College at the University of Minnesota indicates that students following
these programs do as well in their subsequent academic work as students
of equal scholastic ability who have pursued the usual prerequisite patterns.⁸

Appraisal in Terms of Follow-up Studies of Former Students and 
Graduates. In an effort to gain a type of appraisal not provided by the com- 
prehensive examinations, the University of Chicago sent a questionnaire 
covering most of the important phases of the New Plan to students who 
had completed the work for the College Certificate. The replies from 1,065 
students were “both gratifying and valuable—gratifying because, on the 
whole, they indicate that in the opinion of students the plan is operating 
effectively; valuable because they call attention to certain aspects of the 
program in which further improvement can be made.” In response to the 
question as to whether they were satisfied with the introductory courses, 
91 per cent answered “yes” for the biological sciences course, 75 per cent
for the humanities course, 63 per cent for the physical sciences course, and 
62 per cent for the social sciences course. Such responses are valuable as 
supplementary to other forms of evaluation. They cannot be taken, of course, 
as the sole means of appraisal.⁹

The Dean of the College at Chicago likewise collected faculty judg- 
ments concerning the success of the New Plan. For the most part these are 
favorable and are expressions of satisfaction on what the courses are doing 
for both the faculty and students. This form of appraisal, however, comes 
close to the first method described. The chief difference lies in the fact 
that the number of judgments is greater. 

Although not conducted as an appraisal of its current program, the
adult follow-up study conducted by C. Robert Pace for the General College
at the University of Minnesota is another example of this type. Faculty
members cooperated in constructing a 52-page questionnaire by supplying
answers to this question: “In terms of what you are now teaching or would
like to teach, what information would you like to have about young adults?”
This questionnaire was then sent to 1600 former students and graduates
from whom slightly less than 1000 usable returns were received. The detailed
analysis gives a discouraging picture on the effectiveness of the general
education provided for these young adults during the period they were
in college.

⁷ Zapffe, Fred C. “The Relation Between General Education and Medical Education.”
In General Education in the American College, 38th Yearbook, Part II, of the
National Society for the Study of Education, 1939. P. 225.

⁸ Eurich, Alvin C. “A Study of Subsequent Academic Achievement of General
College Transfer Students.” Chapter XIII in The Effective General College Curriculum
as Revealed by Examinations. Minneapolis: University of Minnesota Press, 1937.

⁹ Boucher and Brumbaugh. Op. cit., p. 141.


As reported by Pace, the major generalizations from the study can be
summarized as follows:


On Individual Life

1. By and large, the leisure-time of most of these young people is
spent in reading, talking, and such relatively nonproductive pursuits
as listening to the radio and going to the movies: they seek, in other
words, passive spectator entertainment rather than active and creative
outlets for their energies.

2. A substantial minority of young adults give evidence of personality
difficulties and poor emotional adjustment.

3. By and large, the life goals of these young adults seem to be chiefly
tied up with financial rewards.


Home Life

1. Although many adults expressed a desire for more information
about ways to economize, many were also engaged in uneconomical
practices.

2. Many of these young people have set for themselves extremely high
economic standards as a prerequisite for marriage.

3. A large majority who have children wish they had more information
about various problems arising in connection with child care.


Vocational Experiences

1. Many of the activities required in the jobs these people hold are
ones for which they receive little or no training in school.

2. The satisfactions which they get from their jobs seem to be highly
associated with their income and with their general adjustment and
morale.


Socio-Civic Life

1. By and large, these adults apparently fail to appreciate or understand
the interrelationship among problems, the fundamentally associational
nature of modern society. This generalization is supported
by the following facts:

a. The attitudes of these adults toward fundamentally related social
issues are markedly inconsistent.

b. While most of them exhibit a genuine interest in broad national
and international problems, very few of them participate in any
of the political processes (except voting) through which public
opinion is expressed.

c. They exhibit a general lack of participation and interest in local
and community affairs and problems.¹⁰


Comprehensive Over-All Appraisals. The more extensive evaluations of 
recent years have combined several of the methods described above. Brief 
references to three such studies follow: 

Following the extensive investigation of 59 institutions made by the 
Committee on Revision of Standards set up by the Commission on Institu- 
tions of Higher Education of the North Central Association of Colleges and 
Secondary Schools, a criterion of instructional effectiveness was formulated 
that was included in the new Statement of Policy. It reads: 


“An institution will be expected to show a sympathetic concern for 
the quality of instruction offered students and to give evidence of efforts 
to make instruction effective. Consideration will be given to the emphasis 
placed by the institution upon teaching competence in the selection and 
promotion of teachers, to the manner in which young instructors are in- 
ducted into teaching activities, to the aids that are provided as stimuli to 
the growth of individual members of the staff, to the institution's concern 
for high scholarship in students, to its emphasis upon the adjustment of the 
curriculum and teaching procedures to the abilities and interests of students, 
to efforts to make such examinations as are given more reliable and more 
accurate measures of student accomplishment, and to the alertness of the 
faculty to the instructional needs of students. Familiarity of the administra- 
tion and faculty with current discussions of instructional problems at the 
college level and with recent experimental studies of college problems are 
further evidences of institutional alertness to the need for good teaching.”¹¹


On the basis of this criterion the Association now evaluates instruction
in general education as part of the accrediting procedures under five separate
categories: (1) administrative concern for good instruction; (2) attitude
toward student scholarship; (3) practices in adjustment of curriculum and
instruction; (4) attitudes toward marks and examinations; and (5) alertness
of faculty to instructional needs.

¹⁰ Pace, C. Robert. “A Cooperative Faculty Interpretation of the Adult Study,” in
Curriculum Making in the General College, a mimeographed report issued by the
staff of the General College, University of Minnesota, 1940.

See also Pace, C. Robert. They Went to College. Minneapolis: University of
Minnesota Press, 1941.

¹¹ Haggerty, Melvin E. The Educational Program. Volume 3 of The Evaluation
of Higher Institutions. Chicago: University of Chicago Press, 1937. Pp. 257-8.


A report of a comprehensive study somewhat different in nature is now
in press at the University of Minnesota. Moving on from the special effort
to develop comprehensive examinations in line with the objectives of the
program of general education, Ruth Eckert¹² has directed a special two-year
evaluation of the total program. Working cooperatively with the staff,
she attempted to make explicit the goals of the total General College program,
to obtain a clearer description of the students, to discover the specific
character of the changes that occur in young people as they progress in the
General College, to canvass the attitudes of students and faculty toward the
program, and to study the post-school careers of students, both those who
withdrew from college and those who graduated.


As in other comprehensive evaluations of general education, one of her
most baffling obstacles was the lack of appropriate measuring instruments.
Practically every program of general education is much broader in scope and
purposes than the most comprehensive evaluation that can be devised. This
is desirable up to a point and necessary in pioneering educational efforts.
If, however, the gap between the instructional program and the appraisal
becomes too great, the faculty are likely either to fit the program to the
methods of appraisal and thereby impose upon it the same limitations that
are found in the present instruments or to give up any system of appraisal
because none can be devised that will provide evidence of student development
in relation to all the objectives. Either alternative is unfortunate. The
best that can be done, therefore, is to make the evaluation as comprehensive
as ingenuity and resources permit. Such were the efforts at Minnesota.


In summarizing the results from this study, Ruth Eckert writes as 
follows: 


“No final conclusions concerning the values of the General College
program can be drawn until further analyses are completed. But the present
findings give a general impression of the successes and failures of the program.
Evidently a college which has established courses in contemporary
problems can definitely assist students in attaining clearer social concepts, in
reaching a better understanding of vocational, sociological and political
issues, and in becoming more intimately acquainted with contemporary happenings.
The majority of students are conscious of these values and would
definitely recommend the College to a young person of similar interests and
abilities. They are even more unanimous in their endorsement of the counseling
received on a great many problems, and the studies made to date suggest
that from such advisement has come a more sober, thoroughly realistic
planning for the future.

“Less seems to be accomplished in developing more liberal social attitudes
or better personal adjustment. Despite very substantial gains in information,
young people appear no more liberal or socially conscious at the
conclusion of their study than at its beginning. Perhaps a year or two of
even the most functional program would not result in a fundamental reorientation
to such problems. Likewise actual participation does not appear
to be very significantly influenced by residence in the College. Finally, and
perhaps most disquieting of all the facts revealed by these first analyses, is
the inability to distinguish, on the basis of their out-of-school problems,
sensed needs, social attitudes, or actual participation, between those who
dropped out of a traditional program and those who completed a two-year
general education program. Seemingly there exist dynamics of action as yet
untouched by even a highly progressive educational program.

“Any college that seriously studies its own problems may expect results
not very different from those just presented. That measured outcomes fall
far below expectation must naturally be a cause for concern, especially when
the competencies investigated are deemed significant for youth’s long out-of-school
years. But the General College, and any other college that has
closely studied its program, has one tremendous asset in that faculty members
have themselves discovered present strengths and weaknesses and can
therefore constructively and realistically plan for its future development.
Most of the educational imagination, ingenuity, and resources in American
colleges have been bent to the task of educating very able students. It may
take a long period of experimentation before we learn how to broaden more
effectively the viewpoints, to open new vistas of interest, and to incline more
fully toward social goals of the barely average high school graduate.”¹³

¹² Eckert, Ruth. “Preliminary Report on the General College Appraisal.” Unpublished
manuscript made available by the author, who is now preparing the final report
for publication.


Another example of a comprehensive evaluation study of general education
is that conducted cooperatively by trustees, faculty and students at
Bennington College.¹⁴ At the outset, these assumptions were agreed upon
as a basis for carrying on the study:

1. Evaluation of any program implies a set of values by which judgments
are made.

2. The values of an educational institution are expressed in the instructional
program.

3. Bennington College exists essentially for the contribution it can
make to the development of students. The educational values must
therefore be interpreted in terms of changes in students the program
is designed to bring about.

4. With the desired changes in students agreed upon, the chief task of
the evaluation study becomes that of collecting and summarizing the
evidence that will show the extent to which the changes are taking
place or the degree to which the objectives are attained.

5. Because Bennington attempts to establish habits of learning that will
function throughout life, it is essential that the evaluation study be
as much concerned with graduates and former students as with the
present student group.

6. An evaluation program conducted in terms of these assumptions
must be a cooperative undertaking of the entire Bennington community.

¹³ Ibid., pp. 28-9.
¹⁴ Eurich, Alvin C. Evaluation Study. Bennington College Bulletin, News of the
Year 1939-40, VIII (June, 1940).

Following this agreement, faculty, students and trustees submitted questions
for which they wanted answers. These in turn were rated by the
faculty as to relative importance. During the past two years the evaluation
staff has been collecting and summarizing descriptive data on the total student
group and making an intensive study of the educational program in terms
of: (1) basic assumptions and aims, (2) the range of studies taken by
students under a system which makes no prescriptions, (3) the experiences
beyond courses such as those provided during a winter period when the
students are away from the campus and (4) a detailed analysis of what
students do with their time when so little of it is spent in formal classes.

The evaluation as such has been concerned with the following major
questions: Do students follow their interests? Do students know and understand
their fields of study? What person-to-person relationships are developed
to stimulate learning? What attitudes do students develop? What
appraisal is made of the program by other colleges and universities and by
employers? What appraisal is made of the college program by students,
former students, graduates and by the faculty? The evidence answering
these questions is now being brought together in a report of progress. The
analysis reveals, among other generalizations, that students working in a
system of free electives with guidance develop well-balanced programs of
general education; that even though they spend on the average only six
hours a week in class, they devote as much, if not more, time to academic
activities as students at an old and well established woman's college; that
students in general do build programs in line with their measured interests;
that in most fields they make higher scores on subject matter tests than
students generally and that their achievement on such tests is in line with
their measured academic aptitude, which is relatively high; that students,
former students and graduates are enthusiastic about the methods followed;
and that the students are successful in terms of judgments of other colleges
or of employers. There are criticisms and deficiencies, to be sure, but these
are of a relatively minor nature. For the most part the evidence justifies
the assumptions under which the college is operating.

Within the next few years new approaches to the evaluation of general
education as offered by colleges and universities will undoubtedly be advanced.
Never before have colleges been so much concerned with systematic
appraisals of their work. For example, the 22 colleges taking part in the
Cooperative Study in General Education under the sponsorship of the
American Council on Education included 10 general and 32 special problems
of evaluation among those they originally proposed.¹⁵ Many of these
studies are now in process under the direction of Ralph W. Tyler and his
staff. With the general approach much the same as for the Eight-Year
Study of the Progressive Education Association, new instruments and new
means of observation should evolve.

¹⁵ “The Study of Needs as One Basis for Determining Curriculum Objectives and
Content.” Mimeographed report issued by the Cooperative Study in General Education,
6010 Dorchester Avenue, Chicago, Illinois.


CONCLUSIONS


Although not exhaustive, this summary of evaluation practices of general
education in colleges shows that they range all the way from simple
reiterations of a belief in a program to elaborate and extensive investigations
of the major aspects and contributions of the instructional patterns. In
general, though not universally, the more extensive the study, the more
thorough is the appraisal. Out of these widely scattered evaluation programs
there are now emerging a number of conclusions. Primary among these are
the following:

1. The general education a student acquires is not determined wholly
or even to a major degree by the particular pattern of courses he
pursues, by the number of units or credits he accumulates, or by the
years he attends college.

2. Students subjected to experimental programs of general education
do as well, if not better, in subsequent academic work as students
who pursued the usual liberal arts program. Follow-up studies have
not been sufficiently extensive to indicate whether the products of
the experimental programs are more effective, as intended, in living
as members of a democratic society.

3. Students working in a system with no prescribed courses, who, in
consultation with faculty members, develop programs of general education
adapted to their individual needs and interests, acquire a general
education as measured by subject matter tests that is as thorough
as that acquired by students who follow rigidly set patterns of
courses.

4. The results from follow-up studies of former college students and
graduates are discouraging and leave much to be desired by way of
effective living in a democracy.

5. No pattern of general education has as yet emerged that is generally
applicable at all institutions. On the contrary, most enthusiasm for
the programs can be found at institutions that are vigorously attacking
the problem of developing more adequate programs for the
students they serve.

6. An extensive program of evaluation, if cooperatively developed,
tends to stimulate the staff to seek more effective curricula and instructional
methods. This, perhaps, is the major outcome of most of
the evaluations of general education in college.
TECHNIQUES FOR MEASURING NEWER VALUES IN EDUCATION


J. WAYNE WRIGHTSTONE 


Assistant Director, Bureau of Reference, Research and Statistics,
New York City Schools 


Editor's note: One of the very worthwhile contributions of the evaluation
movement to education has been its contribution to the measurement of
newer values and outcomes. The author describes some recent advances in
this area.

Newer Values in Education Are Emerging From Today's Curriculum. 
Two decades ago formal evaluation of the curriculum was limited to the 
measurement of skills in oral and silent reading, writing and spelling and 
to the acquisition of certain items of information and concepts in the 
subjects which comprised the curriculum. In recent years, however, both 
the elementary and secondary school curricula have been revised to include 
more comprehensive objectives of instruction. To the mastery of functional
skills and information have been added such newer objectives as growth in 
desirable attitudes, interests, powers of critical thinking, work-study skills 
and personal-social adaptability. This modification of curricular objectives 
has required a corresponding change in techniques of evaluation. 

Newer Techniques Are Being Devised to Measure the Newer Values. 
In order to appraise newer values in modern education, newer techniques are 
needed to gather evidence of achievement and growth of pupils. Attitude 
scales have been devised to measure the intensity of opinions and beliefs on 
various issues and topics. The measurement of interests has required the 
development of interest inventories, of pupil logs or diaries for reading and 
other activities, and of checklists. The measurement of various aspects of 
critical thinking has made necessary the development of new types of exer- 
cises in objective test form so that indexes of achievement might be obtained 
for ability to organize facts, ability to interpret data, and ability to apply 
principles or generalizations to new situations. In the appraisal of personal- 
social adaptability, for instance, the newer techniques include the develop- 
ment of self-descriptive inventories, improved judgment rating scales, con- 
trolled-observation and time-sampling methods and anecdotal records. Thus 
the concern for newer values in education has led to the construction of 
newer techniques of measurement and appraisal. 

In the sections which follow, techniques for measuring newer values in
education are listed and described to the extent that limited space permits.
By citing some of the newer techniques of measurement, it is hoped that
the reader who is interested in any specific test, scale or device will be able
to obtain actual copies to study in detail. No attempt has been made to list
all of the newer techniques and instruments. Those which are mentioned
are for purposes of illustration.


Newer Techniques for Measuring Attitudes. Several types of scales have
been constructed to measure or to estimate growth of attitudes. These scales
represent a means of obtaining a pupil's opinion about certain social, scientific
or aesthetic issues by checking agreement or disagreement with a given set
of statements or opinions. These are measures of beliefs, or at least intellectually
accepted opinions, but not necessarily valid measures of overt behavior.
Rapport is a very vital factor in administering attitude scales so
that pupils will give honest expressions of their own opinions of agreement
or disagreement.


When defined as expressions of opinions, attitudes have been measured
more or less adequately by Thurstone’s opinion scales on war, the church,
God, the Negro, the Japanese, the Chinese and other phenomena related to
topics in the social field. These scales employ an equal-appearing interval
for weighting the intensity of a statement for or against some object or
idea. Employing the scaling technique used by Thurstone, H. H. Remmers
has collaborated with others to construct generalized scales of attitudes.
These scales have four or five columns on the left-hand side of the page.
At the top of each column may be written the objects, persons, ideas, or
phenomena toward which an expression of attitude is requested. The
individual is asked to check statements which may be scored in terms of
numerical values assigned by judges and showing the relative degree of
intensity of attitudes. The scale for measuring attitudes toward any
institution, for example, has such statements as: “Is perfect in every way;”
“Serves society as a whole;” “Is entirely unnecessary;” and “Does more harm
than good.” Each statement has a numerical weight somewhere between 0 and 12.
The individual checks each statement with which he agrees, and his score
is the median value of the items he has checked. This series of statements
usually numbers about forty-five items for each scale.
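
The median scoring rule just described can be sketched in a few lines of
Python. This is an illustration only: the function name attitude_score and
the four scale values attached to the sample statements are hypothetical,
since the weights actually assigned by the judges are not given here.

```python
from statistics import median

# Hypothetical judge-assigned scale values (0 to 12). The real weights for
# these statements are not reproduced in the article.
SCALE_VALUES = {
    "Is perfect in every way": 11.2,
    "Serves society as a whole": 8.9,
    "Is entirely unnecessary": 2.1,
    "Does more harm than good": 1.4,
}


def attitude_score(checked_statements, scale_values=SCALE_VALUES):
    """Return the median scale value of the statements the pupil checked."""
    values = [scale_values[statement] for statement in checked_statements]
    if not values:
        raise ValueError("no statements were checked")
    return median(values)


# A pupil who checks three statements receives the middle of the three weights.
example_score = attitude_score([
    "Is perfect in every way",
    "Serves society as a whole",
    "Does more harm than good",
])
```

Using the median rather than the mean keeps a single stray check at either
extreme of the scale from pulling the score far from the pupil's other
responses.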


* Published by University of Chicago Press, Chicago, Illinois.
* Remmers, H. H. Studies in Attitudes. Lafayette, Indiana: Division of
Educational Reference, Purdue University.

Using a different technique, Wrightstone has constructed a generalized
measure of social attitudes toward racial, national, and international ideas
and phenomena. This Scale of Civic Beliefs* consists of such statements as
“The Japanese are a sly and crafty race” and “Labor unions have
accomplished much good.” The pupil agrees, disagrees, or is undecided in his
response to each item. An alternate form of the scale with items of Form A
and Form B matched permits a consistency score or index. The same
technique has been used by the Evaluation Staff of the Progressive Education
Association in the construction of attitude scales in the social sciences and
the natural sciences.*

Newer Techniques for Measuring Interests. Interests have long been 
considered as one of the fundamental factors in motivating the acquisition 
of functional information, skills, appreciations, and discriminations.
Interests may perhaps be defined, for purposes of this discussion, as those drives
which lead the individual to various preferences in his activities and conduct.

A questionnaire method of measuring interests has been devised by the
Evaluation Staff of the Eight-Year Study of the Progressive Education
Association. It may be illustrated by some sample items from the P. E. A.
Interest Inventory. The student is asked to check each item in the
questionnaire as follows:


L means like; I means indifferent to; D means dislike; X means had experience;
O means want some opportunity or more opportunities.


                                                     L    I    D    X    O

(a) To listen to radio news commentators            ( )  ( )  ( )  ( )  ( )
(b) To read about the habits of particular animals
    or peculiar characteristics of certain . . .
(c) To read about labor questions                   ( )  ( )  ( )  ( )  ( )
(e) To act as “manager” of an athletic team         ( )  ( )  ( )  ( )  ( )


Another technique involves diaries, logs or journals which students or 
teachers keep in a cumulative fashion. This technique may be illustrated by 
the Reading Records formulated by the Evaluation Staff of the Progressive 
Education Association. After the pupil has made a cumulative log of his 
reading of books, newspapers and magazines, the Progressive Education
Association Reading Record is scored so that each entry is assessed in
accordance with a predetermined scale of values, set up by a jury, and by a
special formula to denote the maturity of the reading level of the book,
magazine, or newspaper article recorded. Thus it is possible to obtain indexes
both of the average maturity and of the range of interests. Adaptations of
this technique may be applied to other types of educational experiences.


* Published by the World Book Company, Yonkers, New York.
* For further information write to: Evaluation Staff of the Eight-Year Study,
Progressive Education Association, University of Chicago, Chicago, Illinois.


Newer Techniques for Measuring Critical Thinking. An objective to
which almost any subject area subscribes is development of powers of critical
thinking. This has become a prominent objective of the natural and social
sciences. From the work that has been done both in the curriculum and in
evaluation several convenient aspects of thinking may be tested by prepared
scales. They are (1) the interpretation of data, (2) the application of
principles and generalizations to new situations, and (3) the recognition of
the logic of an argument or the nature of proof used in materials presented
in the curriculum.


Ability to draw conclusions or to make inferences from facts and
materials in the social and natural sciences becomes increasingly important in
the preparation of students for everyday living. To be sure, the
interpretations of data and the application of principles have been tested more
or less incidentally and often in a haphazard manner by essay examinations. If
newer instructional practices are to introduce the objective of critical thinking
seriously into the school curriculum, valid and practical methods of
appraisal must be devised and made available. At the elementary school level
a Test of Critical Thinking in the Social Studies* is available. This test is
divided into three parts. Part I measures abilities to obtain facts from graphs,
maps, references, newspapers, and magazines. Part II measures abilities to
draw reasonable conclusions from given facts. Part III measures abilities to
apply generalizations to new situations. At the secondary school level a
measure of aspects of thinking is the Cooperative Test of Social Studies
Abilities.* The high school pupil is presented with sets of facts in narrative,
graphic, or tabular form in a battery of subtests entitled obtaining facts,
organizing facts, interpreting facts and applying generalizations in the
social studies.


* Published by Bureau of Publications, Teachers College, Columbia University,
New York City.
* Published by Cooperative Test Service, New York City.

Various tests and techniques of major aspects of critical thinking have
been developed by the Evaluation Staff of the Progressive Education
Association at the University of Chicago. These include a series of tests entitled
Application of Principles in Science which measure ability to recognize when
and how science principles may be applied in problem situations new to the
pupil. A similar series has been devised for the social studies entitled Social
Problems. These tests are constructed by presenting a problem in a paragraph,
graph, or table for study. The problem is followed by principles and
generalizations, some of which apply to the problem. The pupil is asked to
indicate the correct principles and generalizations. In addition to the correct
principles and generalizations are some which contain errors in reasoning
such as true but irrelevant statements, false analogies, appeals to false
authority, and popular misconceptions of the truth or of the facts.


Another series of tests entitled Interpretation of Data has been devised
by the Evaluation Staff of the Progressive Education Association. These tests
are used to estimate the ability of high school pupils to judge the
soundness of interpretations which are presented in the form of statements. These
statements or interpretations follow data which are adapted from the fields
of science and the social studies and are presented in paragraphs, tables, charts
and graphs. These statements are to be marked as true, probably true,
probably false, false, or insufficient data. From the test results it is possible to
find out whether a student has a tendency to be over-cautious, a tendency to
go beyond the facts, or a tendency to evaluate carefully the interpretations
based upon the data which are given.
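
One way such tendencies might be tallied from a pupil's marks is sketched
below in Python. The scoring rule is an assumption made for illustration and
is not the published scoring key: a mark that claims more certainty than the
keyed answer is counted as going beyond the data, a mark that claims less is
counted as over-cautious, and a mark on the opposite side of true and false
is counted as a crude error. The names classify and tendency_profile are
hypothetical.

```python
from collections import Counter

# Assumed ranking of how strongly each mark commits to a judgment.
CERTAINTY = {"insufficient data": 0, "probably true": 1, "probably false": 1,
             "true": 2, "false": 2}
# Direction of the judgment: +1 toward true, -1 toward false, 0 for neither.
DIRECTION = {"insufficient data": 0, "probably true": 1, "probably false": -1,
             "true": 1, "false": -1}


def classify(response, key):
    """Label one pupil response relative to the keyed answer."""
    if response == key:
        return "accurate"
    if DIRECTION[response] * DIRECTION[key] < 0:
        return "crude error"        # pupil and key point in opposite directions
    if CERTAINTY[response] > CERTAINTY[key]:
        return "beyond the data"    # more certainty than the data warrant
    return "over-cautious"          # less certainty than the data warrant


def tendency_profile(responses, keys):
    """Count how often each kind of response occurs across a test."""
    return Counter(classify(r, k) for r, k in zip(responses, keys))


profile = tendency_profile(
    ["true", "insufficient data", "probably true", "false"],
    ["probably true", "true", "probably true", "probably true"],
)
```

A pupil whose profile is dominated by one of the two middle labels shows the
corresponding tendency; a profile dominated by accurate marks suggests
careful evaluation of the interpretations.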


Another test developed by the Evaluation Staff of the Progressive
Education Association is entitled Nature of Proof. This test is designed to
measure the pupil's application of principles of logical reasoning in the
fields of science, social studies and mathematics. The test exercises are
situations adapted from materials in newspapers, magazines and books. The
students are asked to react to the relevancy of the assumptions which
underlie arguments or opinions. They are asked to identify certain major and
minor premises that must be accepted if certain stated conclusions are to
follow. They are asked to discover crucial words and phrases which must
be precisely defined so that a changed definition will not produce a changed
conclusion. They are asked to react to logical arguments which cannot be
disproved by ridiculing the person making the argument or by attacking
his motives.

Newer Techniques for Measuring Work-Study Skills. Work-study skills
involve skills necessary for independent study and are usually identified
with the ability to read maps, graphs, charts, and tables, to use the table of
contents and the index of a book, and to find items of information in
reference books. In addition, elementary as well as secondary schools are placing
an increasing emphasis upon effective use of the school and local libraries,
and this involves such skills as knowing the effective use of library privileges,
the techniques of withdrawing and returning books, the numbering or filing
system of the books, and so on.


At the elementary and junior high school levels, the most comprehensive
tests of work-study skills now available are the Iowa Every-Pupil Tests of
Basic Study Skills.* Tests on the use of the library at the elementary and
secondary school levels are available in the Peabody Library Information
Tests.* Reed has prepared a Test on the Use of the Library for High
Schools* which is especially adapted for the pupils of the upper grades of
high school.

Newer Techniques for Measuring Personal-Social Adaptability. A 
newer educational value in the modern curriculum calls for evaluating the 
personal-social adaptability, or so-called personality factors, of children. The 
process of appraisal of personal and social adjustment has used a variety 
of methods. These range from the free association methods, self-descriptive 
questionnaires and psychoneurotic inventories to rating scales, anecdotal 
records and behavior descriptions, including the case study methods. 

* Published by Houghton Mifflin Co., New York City.
* Published by the Educational Test Bureau, Minneapolis, Minnesota.
* Distributed by Evaluation Staff of the Progressive Education Association,
University of Chicago, Chicago, Illinois.


For children below the fourth grade, a widely used technique is the
so-called “time sampling,” or controlled-observation method. In this, the
behavior to be measured is defined in terms of overt and observable
activities, and specially trained observers record the occurrence of the defined
activities during a specified time interval over a period of days, weeks, or
months. Thus it is possible to obtain an index of an individual's behavior
in normal school situations. Pistor* and Wrightstone,* using this method,
appraised such qualities as initiative, cooperation, work-spirit, and
dependability. Olson and Cunningham* have compiled an extensive bibliography
of the observational techniques constructed to measure many personal and
social qualities.
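
A simple index of the kind mentioned above might be computed as the
proportion of observation intervals in which the defined behavior was seen.
The Python sketch below assumes that form of index; the function name
time_sample_index and the sample records are hypothetical and do not
reproduce any particular published procedure.

```python
def time_sample_index(interval_records):
    """Proportion of observation intervals in which the behavior occurred.

    interval_records holds one True/False entry per observation interval,
    True meaning the defined behavior was observed during that interval.
    """
    if not interval_records:
        raise ValueError("at least one observation interval is required")
    return sum(interval_records) / len(interval_records)


# Hypothetical record: cooperation observed in 3 of 4 five-minute intervals.
cooperation_index = time_sample_index([True, False, True, True])
```

Because the index is a proportion, records gathered over different numbers
of intervals can still be compared from pupil to pupil.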

Of more practical value is the rating scale. Several of these have
appeared at the elementary school level, including the Haggerty-Olson-Wickman
Behavior Rating Scale,* and the Winnetka Scale for Rating School
Behavior and Attitudes.* In self-descriptive scales at the elementary school
level, the oldest and best known scale is the Woodworth-Mathews Personal
Data Sheet.* At the secondary school level are the Bernreuter Personality
Inventory* and other similar scales.


Records and Reports in Evaluation. Records which bring together and 
summarize evaluation data are as essential as the data themselves. This
involves the use of an adequate cumulative record card. One of the most
comprehensive and best cumulative records is the one devised and used in the
secondary schools of the Denver, Colorado, Public Schools. It can easily be
adapted for elementary as well as secondary school pupils. Another excellent
cumulative record card is the one devised and distributed by the American 
Council on Education, Washington, D. C. By the use of such cumulative 
records test data and observational records can easily be organized into a 
portrait or sketch of the individual's direction and rate of growth. 

Each pupil may keep in an individual folder a record of (1) things
he has done or managed by himself; (2) things he has investigated by
himself; (3) books, articles, or materials he or she has obtained and read
more or less through individual initiative. For the individual-pupil folder
the teacher may file individual health records and work reports. Inserted in
this folder, also, would be notes on personality development which have
been entered by the teacher and a test record card which includes the
individual results of standardized tests.


* Pistor, Frederick. “Evaluating Newer School Practices by the Observational
Record,” National Elementary Principal, XVI (July, 1937), 377-89.
* Wrightstone, J. Wayne. “Constructing an Observational Technique,” Teachers
College Record, XXXIV (October, 1935), 1-9.
* Olson, Willard C. and Cunningham, Elizabeth M. “Time-Sampling Techniques,”
Child Development, V (March, 1934), 41-58.
* Published by World Book Company, Yonkers, New York.
* Published by Winnetka Educational Press, Winnetka, Illinois.
* Published by C. H. Stoelting Company, Chicago, Illinois.
* Published by Stanford University Press, Stanford, California.


SUMMARY 


Newer practices in elementary and secondary education have required 
the development of newer techniques for gathering objective evidence. Steps
in the newer process of evaluation are, first, to formulate a comprehensive 
range of curricular objectives which will include not only acquisition of 
information and skills but also evidence of growth in interests, attitudes, 
appreciations, critical thinking, and social behavior. A second step is to 
define each objective in terms of pupil behavior. A third step is to find 
ready-made tests and techniques or to devise new formal and informal 
methods for gathering and appraising evidence of growth in each objective 
thus defined. These newer methods are illustrated by tests, inventories, rating 
scales, anecdotal records, and controlled-observation techniques. A fourth
step is to interpret and to apply the evidence thus gathered. 


In order to interpret evaluation data most wisely the fragments of 
evidence collected about the pupil should be correlated and integrated into 
a portrait of the individual by means of appropriate cumulative records and 
reports. The relationships among various aspects of pupil growth should be 
explicitly shown in the portrait. Only when these steps are carried through 
is it possible to use effectively the techniques for measuring newer values 
in education.


THE CONCEPT OF ORGANISMIC AGE*


WILLARD C. OLSON and BYRON O. HUGHES
University of Michigan


Editor's note: Shifting emphases in biology and psychology have brought
many new concepts and practices to education. The authors explain a method
of compositing growth values.

THE expression of test results in age units has been a common technique
in the field of measurements. The same concept is readily applied to any
factor that changes with age such as height, weight, dentition, strength,
interest, etc. The writers have employed age units in an attempt to secure
a more comprehensive description of a child developing through the years.
After constructing patterns of growth curves for large numbers of children
it became apparent that it would be of interest in testing hypotheses on
children as wholes to study the center of gravity of growth systems and the
relation of separate aspects of growth to the whole. To do this, growth
values as of a given chronological age for each child were averaged and
given the name “organismic age.” The method and possible significance of
the concept can be clarified by the example in Table I. The data describe
a child on his seventh birthday as read from his curves of development.


TABLE I
GROWTH VALUES FOR A SEVEN-YEAR-OLD BOY


8 months 


84 months 


* A phase of a longitudinal, multidiscipline study of growth in the Child De- 
velopment Laboratories of the University Elementary School of the University of 
Michigan. For more detailed accounts see Bibliography at the close of the article. 

In the illustration the high value of 106 months for Mental Age is
22 months higher than the child’s Chronological Age. In declining order
we then have measures representing growth in dentition, reading, weight,
height, ossification of the hand and wrist, and strength of grip. An
average of these ages is 82 months and is labelled Organismic Age. In
average maturity he is thus 2 months below his Chronological Age. The
value of the more complete diagnosis for the teacher may be illustrated
by considering his Reading Age. This is 26 months below his Mental Age
and 4 months below his Chronological Age. In terms of a purely intellectual
interpretation there would be an inclination to call him a case of “reading
disability.” Actually, however, growth in reading is within the matrix
covered by most of the curves and is only 2 months less than his Organismic
Age. The degree of homogeneity of the pattern of growth about the central
tendency is of interest in studying growth problems. In the illustration the
average deviation of the measures from the mean is about 8 months. The
factors affecting variability will be the subject of future reports.
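
The averaging just described can be sketched in Python as follows. Only the
Mental Age (106 months) and the Reading Age (80 months, that is, 26 months
below the Mental Age) come from the text. The other five values are invented
placeholders, chosen merely so that the result agrees with the stated
Organismic Age of 82 months and average deviation of about 8 months; they
are not the authors' data, and the function names are hypothetical.

```python
from statistics import mean

# Growth ages in months for the seven-year-old of the illustration.
growth_ages = {
    "mental": 106,    # stated in the text
    "dental": 86,     # placeholder, not from the article
    "reading": 80,    # from the text: 26 months below Mental Age
    "weight": 79,     # placeholder, not from the article
    "height": 77,     # placeholder, not from the article
    "carpal": 74,     # placeholder, not from the article
    "grip": 72,       # placeholder, not from the article
}


def organismic_age(ages):
    """Arithmetic mean of the separate growth ages."""
    return mean(ages.values())


def average_deviation(ages):
    """Mean absolute deviation of the growth ages from the organismic age."""
    center = organismic_age(ages)
    return mean([abs(value - center) for value in ages.values()])


oa = organismic_age(growth_ages)          # 82 months
spread = average_deviation(growth_ages)   # 8 months
```

Comparing a single measure with the organismic age rather than with the
Mental Age alone is what leads to the milder reading of this child's Reading
Age given above.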


The question is asked frequently, “How many measurements and which 
ones are needed to determine an adequate organismic age?” Unfortunately 
there is no easy answer. If a person takes but one measure he has some idea 
about the growth of the child, if he takes two he has a more complete 
description, if he takes three he has a still better one, etc. Theoretically one 
would have determined a stable organismic age when no further addition 
of values would cause it to fluctuate in a significant manner. The
generalization that achievement tends to be an expression of total growth has
occasionally been challenged by the worker who has collected an additional
physical measure or two such as height and weight. Experience reveals that 
the gap tends to be filled in as one gathers additional and sometimes more 
subtle measures of growth. 


Theoretically the measures taken should represent an inclusive theory 
of the organism. A complete account might include measures of emotionality, 
social adjustment, gross bodily development, circulation, efficiency of sense 
organs, development of educational and physical skills, and measures of 
metabolic function. Practically, the work to date has been more limited. 


The trend described by growth in organismic age tends to be rather
stable for the periods of life thus far studied intensively. For example, the
child in Table I had an organismic quotient (O. A. ÷ C. A. × 100) of 98 at age
seven and his complete record yields a value of 101 at age five, 102 at
age six, 99 at age eight, 98 at age nine, 102 at age ten, and 106 at age
eleven. In other words, the annual increment of total growth as based upon
organismic age has a tendency to remain proportional from year to year.
The predictability appears to continue into the adolescent period. It has been
pointed out that organismic age has some of the stability that occurs in
eliminating fluctuating errors in averaging a series of determinations.
However, from the point of view of growth theory, it is also interesting to note
the hypothesis that compensatory fluctuations may occur which relate
somehow to the energy available to the organism.
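
The quotient quoted for age seven can be checked with a short Python sketch.
It assumes the quotient is formed like other age quotients, as organismic age
divided by chronological age and multiplied by 100; the function name is
hypothetical.

```python
def organismic_quotient(organismic_age_months, chronological_age_months):
    """Organismic age as a percentage of chronological age."""
    return 100 * organismic_age_months / chronological_age_months


# Age seven in the illustration: O. A. of 82 months, C. A. of 84 months.
quotient_at_seven = round(organismic_quotient(82, 84))
```

Rounded to the nearest whole number this gives 98, the value reported for
the child at age seven.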

The concept of Organismic Age has been serviceable in various re- 
searches on growth and applications to individual study in the University 
Elementary School. The range of meanings and applications will be reported 
more completely in a monograph in preparation.


SELF-EVALUATION IN TEACHER EDUCATION*


MAURICE E. TROYER
Associate in Evaluation,
Commission on Teacher Education

Editor's note: One of the pressing needs of teacher education has been
more systematic evaluation of one sort or another. The author presents new
materials on student self-evaluation.

THE PROBLEM

THE need for democratizing the evaluative process may be revealed
through examination of evaluative practices in the light of some fundamental
democratic values.

“The defense of democracy is far more than an economic question. In
the last analysis, it is a moral and spiritual question, a question of the
values and ideas to be defined and applied to life (4, p. 50) . . . The
survival of democracy in the world depends on the vigor and strength of
democratic loyalties among the peoples of the earth (4, p. 51) . . . First,
the free man is loyal to himself as a human being of dignity and worth.
The obligation of the school here is to give each pupil a deep feeling of
competence, adequacy, and security, to bring each individual under its care
to maturity and freedom (4, p. 56) . . . Second, the free man is loyal to
the principle of human equality in brotherhood. He treats his neighbor, as
well as himself, as a human being of dignity and worth. In its effort to
develop this loyalty in the young the school should first of all arrange its
whole life in harmony with the principle (4, p. 57) . . . The first
responsibility of the teacher is to maintain a steadfast and informed loyalty to
the values and processes of democracy, to the several articles of the
democratic faith, to the interests of children and the cause of human freedom.
In the work of the school and in the life of the community he should
exemplify the spirit of democracy. He should struggle without ceasing to
apply the articles of this great faith to both education and society. He should
be among the first to sense violations of the principles of democracy, to
apply these principles to neglected fields, to keep alight the lamps of reason,
to champion the interests of the underprivileged and downtrodden (4, p.
109) . . . A fourth responsibility of democratic government is to
safeguard the integrity of the teacher and to encourage him to grow to his full
stature (4, p. 107).”

* This paper is necessarily a theoretical presentation, rather than a report of
experimentation. Evidence as to the nature and seriousness of the problem is derived
from a comparative analysis of some commonly accepted characteristics of democratic
processes and of conventional procedures in evaluation. Self-evaluation as a partial
solution is suggested as a procedure worthy of extensive trial and experimentation.

No one knows just how widely these characteristics of democracy,
quoted from a recent pronouncement of the Educational Policies
Commission, are accepted by the educational profession. But, for the purposes of
this paper, it is assumed that these considered statements represent sound,
democratic philosophy.

Observation of educational practices in schools and colleges, by and
large, reveals many disparities between professed ideals and actual behavior.
In many instances, students are provided opportunity to govern themselves
so long as they encounter no real issues, then administrators or teachers
take over; teachers participate in policy formation so long as no real
community issues arise, then the school board or superintendent steps in. Few
appropriate experiences are provided for students in professional education
to develop an understanding of the ways of democracy: few real
opportunities to identify and develop their own purposes, to work cooperatively
on real problems, to plan and carry out an attack on real issues, to gain an
understanding of the place of leadership and the utilization of resources,
and to know when democracy demands group methods or when individual
decision and action are in order. Typically, students in our public and
professional schools follow plans laid down by others with respect both to the
total instructional program and to the activities within individual courses.
They take tests chosen or developed by their instructors, administered en
masse for convenience. The results are interpreted for them. They are told
what their next steps should be. Those who become teachers are evaluated
by supervisors or principals for purposes and on bases not clear to the
evaluated. Frequently, the basis for such appraisal is the success of the
teacher's students in a curriculum determined in a central office as revealed
by a testing program originated in another central office.
To the extent that evaluative procedures and relationships in our public
and professional schools are generally practiced as described above, it can
hardly be said that evaluation will contribute to the loyalties on which
democracy is dependent. To what extent will such practices give pupils a
genuine feeling of competence, adequacy, security, dignity, and worth? Are
these evaluative procedures loyal to the principle of human equality and
brotherhood? Do they tend to increase loyalty to ideals of honesty,
fair-mindedness, and the scientific spirit? Do they safeguard the integrity of the
teacher and free him to seek earnestly to grow to his full stature? In short,
do they promote loyalty to the values and processes of democracy on the
part of administrator, teacher, and student?


DEFINITIONS 


Evaluation may be defined as the process of making those value
judgments by which one determines his courses of action. For purposes of this
paper, evaluation in teacher education is the process of making value
judgments in the acceptance, rejection, and clarification of identified professional
goals, working assumptions, appropriate experiences through which the goals
may be realized, and in the assessing and appraising of evidences of progress
toward the goals. Self-evaluation is the same process with the student
serving the dual role of evaluatee and evaluator. Instead of an evaluator, a
person superior by reason of his position, sitting in the saddle and sinking the
spurs of test scores or observed behavioral evidence of evaluative significance
into the evaluatee, the learner is in the saddle. The professor of education,
supervisor, teacher, or evaluation expert, as the case may be, is a resource
person whose responsibility is that of helping the learner to identify his own
strengths and weaknesses and to plan accordingly.

Most of the decisions in life situations are based on value judgments
which we must make for ourselves. This is true for children in their
out-of-school hours, for teachers in their professional responsibilities, and for the
life problems of adults generally. Written high among the goals of
education should be one concerned with increased competence in identifying one's
own strengths and weaknesses and in choosing and planning next steps
accordingly. The problem of self-evaluation of teacher growth at the pre-
and in-service levels is especially important, for it is quite unlikely that
children or adults generally will develop any great proficiency at self-evaluation
unless teachers possess and practice this same skill.

Self-evaluation in teacher education needs emphasis from two angles: 
first, as a process used by the undergraduate or the teacher-in-service toward 
the improvement of his own program of growth; and second, as the develop- 
ment of those insights and skills which will help the teacher look upon the 
evaluative relationship she has with boys and girls as a guidance relationship.


PROCESSES OF SELF-EVALUATION 


It is appropriate to explore two types of self-evaluative procedure:
(a) self-evaluation of the more subjective type with respect to those kinds
of goals and situations for which formal evaluative approaches are as yet
unsatisfactory, and (b) self-evaluative activities utilizing formally constructed
evaluative instruments.

It is quite difficult, if not impossible, for institutions with enrollments
limited by quota at the freshman level to incorporate in their selection
program very much of a guidance function, or to proceed on the basis of any
substantial self-evaluation on the part of the applicant. A few high school
guidance programs are now experimenting with exploratory experiences
through which students may be led to make intelligent decisions concerning
teaching as a professional choice. Initial selection in professional schools will
become less necessary as these high school practices spread and improve.

In certain teacher education institutions (7)* with an orientation course
designed to fulfill a guidance and selective function, students are being
given increased responsibility of an evaluative nature. Many guidance and
personnel workers have recognized for some time that unless students come
to see the value of the evidences with respect to their own potentialities
and competencies and come to their own decisions in the light of these
evidences, guidance is relatively ineffective (11).*

It seems highly desirable that the whole professional education program
should carry, among other things, a continuously selective function. In such
a procedure, the student plans with his adviser or instructor for observational
and participatory experiences, reading, and discussions, through which real
goals in his general and professional education would begin to emerge for
him. These experiences would tend to raise in the mind of the student
such questions as the following: What understandings will I need in order
to take my place as a leading citizen in a community? What understandings
of our physical, biological and social environment, what cultural and
spiritual appreciations, what social competencies will I need? Specifically, as
a teacher, what shall I need to be able to do, what sort of person ought
I to be? And inevitably, for those who would continue in professional
education, the question: Where do I now stand with respect to these identified
competencies of one who would be an effective citizen and teacher in a
community?


* Stanford University, The College of William and Mary, Wayne University,
Southern Illinois State Normal, Colorado State College of Education, and University
of Nebraska are developing beginning courses in education which are assuming an
increased guidance responsibility. There are others no doubt that have not come to
the direct attention of the writer.

* A very interesting and possibly enlightening follow-up study might well be
made by personnel and guidance divisions in which two groups were selected for
comparison: one would include those students who, in relationship with the
counselors, gave every evidence of a clear interest in and understanding of themselves
in relation to possible programs ahead of them, while in the other group were a
similar number of people with equal potentialities, but who had not, for some reason
or another, become sufficiently interested in themselves and in the future to pay
any more than routine attention to evidences of their own strengths and weaknesses
in relation to likely demands of the future.

Through such experiences the student may conclude as follows: The 
best teachers I have seen know what is going on in the world. They seem
to read widely, listen well, and contribute intelligently in the discussion of 
problems that have roots in many fields of knowledge. When the import of 
these competencies begins to have a real place in his thinking, it should not 
be difficult to lead him to ask with genuine curiosity: How well and widely 
do I read? Do I listen well? Can I express myself well? Do I participate 
freely in discussions of books, music, art, movies, sports, topics of social,
economic, political, scientific consequence? Or do I find myself sorely tempted
to shun groups if I know certain of these topics are likely to be the center 
of discussion? When this point is reached, the student's adviser can help 
him in several ways. 

Perhaps the student could take a contemporary affairs test. If his 
curiosity is genuine, if he really wants to know, let him take the test, give 
him the key, let him score his performance and interpret the results, with 
such counsel as is necessary. Likely he will get more out of scoring the test 
than would a machine or his adviser. 

There is reason to believe that the student will be more willing to face
the test results for what they are worth if, in certain areas of the
contemporary affairs test, he has had the satisfying feeling of extensive
knowledge through a high proportion of items correctly answered; or, in certain
other areas, rather complete deflation through the experience of checking
and counting errors which show almost no familiarity with the field. He is
the more willing to face the evidence for what it is worth because no teacher
or machine stands between him and his knowledge of results. In scoring
his own test, he may have re-examined most of the items in the light of the
key. And, in some cases, he may have gone so far as to check the key
against original reports and magazines from which the test was constructed.
Such a procedure may, at first glance, appear to be overly time-consuming
for the student, but it must be remembered that the teacher’s concern over
time-consumption has probably resulted from the frequent necessity of
scoring many tests in one evening. The student has to score only his own
paper. And the worthwhileness of the test, for self-evaluation, depends upon
the extent to which its results are desired and used by the student.

Another means of approaching this same self-evaluative problem might
be through a log or check of newspapers, magazines, and books read over a
period of several weeks or a month. There is no necessity for the student to
pad the log. He has seen certain goals toward which he must strive. He
wants to know where he stands now, to know how he can improve his
learning experiences, and what direction they should take if he is to face
his first teaching job with the greatest possible degree of readiness. This
log or self-survey can be analyzed with respect to the types and quality of
materials read (2, pp. 85-100).

Let us assume that the student in the orientation course has followed
such formal and informal procedures in thoroughly identifying his own 
strengths and weaknesses under the guidance of his instructor. He may de- 
cide that he has so many weaknesses, or such outstanding deficiencies that 
it would be better for him to redirect his educational effort. Or he may 
proceed to the next phase, developing in the light of emerging goals and 
identified needs a program of experiences that will constitute his curriculum 
of general and professional education. He will, of course, have the help 
of the staff as needed in this planning. 

If, in sizing up himself, his needs, his purposes, he is encouraged to 
proceed in the teaching profession, he will now look to the resources of the 
institution through which he may reasonably expect to achieve his goals and 
outgrow his weaknesses. He will now begin to come to courses with a real 
purpose. This will give new challenge to some of his instructors. It may 
prove embarrassing and disappointing to other instructors, and to the student 
as well. Staff members definitely committed to serving as resource people 
on students’ problems are both challenged and embarrassed more frequently 
than are instructors who keep the student busy on their own highly out- 
lined and predetermined problems. Staff members may find it necessary 
continuously to reconstruct their courses, to provide continued field experi- 
ences along with discussions, readings, and lectures. They may need to get 
together with students to plan individual and group programs. They may 
actually undertake to work with prospective teachers as the latter are urged 
in professional courses and books to work with their own pupils later on. 




534 JOURNAL OF EDUCATIONAL RESEARCH [Vol. 35, No. 7


And these instructors will share a concern with the student for ever-increasing 
comprehension of the significance of the goals that they have set up, for 
the emergence of new goals, and for the improvement of ability in
self-evaluation.

Casual or somewhat incidental observations by students in the field 
are quite difficult for the student himself to evaluate. But if observation is 
instrumental to some responsibility the student is fulfilling, he will then 
find within the activity much of evaluative significance. A student who is 
observing in the classroom of a teacher in order to contribute to the growth 
record of boys and girls in that class with respect to those goals which do
not lend themselves to formal evaluative techniques will have much to 
examine in cooperation with supervisor and cooperating teacher that will 
have evaluative significance. This student's insight into human behavior 
will be challenged. Resources will be called for, their usefulness appraised.
Students with real problems tend to become critical, and to broaden their 
evaluative procedures. 

The student teacher will participate with cooperating teacher and
supervisor in setting up guides for use in recording his activities and analyzing
and appraising his effectiveness. Inasmuch as student teaching is an activity
which embodies his professional education and in which all previous
preparation is potentially useful, the evaluative procedures should be broadly
conceived. Opportunity should arise for him to appraise his knowledge of
things and processes, understandings of children, parents, and other
teachers, ability to plan with boys and girls and carry through plans. Problems
in personal relationship, materials developed, and resources used may be 
analyzed and appraised first by the student and then validated in conference 
against the appraisals of cooperating teacher and supervisor. 

Eventually the student approaches the problem of placement. A principal 
or superintendent obviously has a right to know certain things about any 
applicant for a teaching position—his physical and mental health, emotional
stability, social competence, academic achievement, leadership qualities,
interest in and understanding of every-day occurrences in the fields of politics,
economics, international affairs, social relationship, literature, music, art, 
science, sports, and the like, his skill in oral and written expression, and 
above all, his understanding of and ability to work with boys and girls. The 
student who knows himself is prepared to give such information. He makes
[...] responsibility, of the evolution of the goals toward which he has worked and
is still working, of how he has identified his strengths and weaknesses, of 
what he has done to grow and of how effective it has been. On the basis of 
this report, moreover, he could propose a program of continued growth in 
service. This plan would imply continued responsibility on the part of the 
institution from which the student is about to graduate as well, of course, as 
on the part of administrators and supervisors in the school system wherein 
he is eventually to be employed. His adviser and the several faculty members 
with whom he has worked most closely will go over this report with him, 
pointing out and discussing what they believe to be the weaknesses of the 
report—where the student has underestimated, overestimated, or ignored 
aspects of growth, and in general validating the report of the student against 
their own judgment. What could be more appropriate as a final evaluative 
activity in the pre-service program than such a self-appraisal and projected 
program for the future? 

Several questions may be asked about the hazards of self-evaluation. 
Does self-evaluation provide an undue temptation toward apple-polishing, 
cheating, and otherwise beating the game? The final report referred to above 
provides a last opportunity for the supervisor to bring the student face to 
face with his own intellectual honesty or lack thereof. In order for a teacher- 
student relationship to be a guidance relationship, the two must work closely 
enough together so that they really know each other. A student who cannot 
develop genuineness of purpose, identify his own strengths and weaknesses, 
face revealed deficiencies, or utilize them for future planning toward the 
realization of his goals, should be brought to realize that these are short- 
comings of the first order which make him unfit for the profession for 
which he is preparing. The procedure here described will result in apple- 
polishing, padding and whitewashing on the part of the student only if the 
relationship between him and his teacher has remained distant and artificial. 
People acquainted in the field of guidance know that each person is a unique 
individual and that it is more difficult to crib, fake or apple-polish with 
respect to one’s own uniqueness, than it is in the appraisal of activities based 
on a common textbook, syllabus or bibliography. 

Is there a grave danger that through self-evaluation the student may 
become too introspective? This question cannot be overlooked. We all know, 
within our own experiences or among our own acquaintances, of failure due 
to lack of introspection or self-analysis. Some people do not know they are
out of a job until they are told. In contrast, there are a number of people
who are extremely sensitive to their strengths and weaknesses, especially 
their weaknesses. They know all about them, but instead of using their 
knowledge to improve and to direct their activities toward the elimination 
of their weaknesses, they become further frustrated and disorganized as time 
goes on. Leadership on the part of the adviser or instructor is called for with 
respect to this point. A student should not be brought to identify his 
strengths and weaknesses until he is ready to do so. We know little about 
readiness for the development of self-evaluative skill in a variety of
competencies, especially social ones. It would seem to be a very good guess,
however, that when a student becomes genuinely interested in pursuing 
certain goals, educational and professional, he may be ready to face the 
competencies and potentialities demanded of him. 

But there is still another phase of the evaluation program so far as this 
individual is concerned. Will it be too much to hope that he will continue 
this process of self-evaluation after he has found a job, especially if he has 
guidance? Remember that he brings to his new responsibility a proposed 
program for continuous growth planned in the light of self-recognized
weaknesses. This provides a basis for common effort on the part of teacher and
supervisor or principal. It gives them some problems on which they may go 
to work together. This is an important point. People can talk frankly, and 
earnestly work together on issues so long as there is a spirit of confidence 
and mutual purpose. If the confidence is lacking the discussion will turn
from issues to personalities, and when this occurs the status of both
supervisor and teacher is at stake. Little cooperative constructive work on
the problems of any school system can be done under these conditions. 

What is required is that the supervisor-teacher relationship should
become a guidance relationship in which the teacher looks upon evaluation as
an opportunity for him to receive help from the supervisor in identifying the
strengths and weaknesses of the educational program of the school, and in
identifying his own strengths and weaknesses as a teacher responsible for the
improvement and operation of that program. The supervisor-teacher
relationship is one through which the teacher should receive cooperation, not
necessarily “the answers,” in efforts to improve his and the school’s work (9).

Our student, coming up through such a program as here described, will
naturally not approach full-time responsibility in the profession with utter
naivete. He will have had some experience with many types of professional
problems. Even so, he may find himself repeatedly frustrated and rather
continuously unhappy over the elusiveness of solutions to numerous complex
problems. This person, in his relationships with supervisors and principal,
will outgrow his difficulty or be guided out of the profession rather than
kicked out. If he is truly guided out, the spirit under which he goes will
ward off many undesirable consequences. For the gross injury to the pro- 
gram of a school when a faculty member is dismissed does not arise in rela- 
tionship to the person dismissed, but rather from the impact on other 
teachers who are themselves not entirely secure in their positions. 

Most of the emphasis in describing this program of evaluation has been 
placed on process. What techniques and procedures will be used? Stand- 
ardized and teacher-made tests will be used—the former chosen, coopera- 
tively, by student and teacher; the latter developed by teacher and student 
when no standardized test can be found to meet the immediate need. Rating 
scales, questionnaires, and checklists will be used when and as they fill a 
felt need. And they will be more effectively used than under present pro- 
cedures, for most of the hazards will have been removed when the use of 
such instruments is limited to situations in which the student has real pur- 
pose. Anecdotes will be recorded by both supervisor and student. The close 
incidental working relationships and scheduled interviews will take on in- 
creased evaluative significance as both student and instructor become sen- 
sitized to real goals and purposefully planned experience. A first-ranking 
quality of staff members in schools of education and of principals and 
supervisors will be the ability to gain the confidence of those with whom 
they are working. 

Relatively little has been written about the use of formally constructed 
tests as instruments of self-evaluation. Several years ago when the writer 
was working with academically delinquent students in a rehabilitation course 
at the college freshman level, he gained his first insights into some of the 
merits of self-evaluation. Here was a group of students who had been
placed in a course because they were in danger of flunking out of college. 
The purpose of the course was to salvage those students who had sufficient 
ability to improve their background of knowledge and skill to the point 
where they could succeed in college courses, or to guide out, rather than 
kick out, those students who proved to have insufficient ability to profit by 
college experience. Over a number of years the practice with these students 
upon entering the course was to give them survey tests in English,
mathematics, and reading, and questionnaires or checklists about health, study
habits, emotional and social adjustments, and interests. This whole program 
consumed in all anywhere from eight to ten hours of class time. There was 
an effort to appraise the status of the student as quickly as possible, identify 
his strengths and weaknesses, plot them on a profile chart, diagnose his case, 
and prescribe remedial exercises quickly—all within the space of the first 
two or three weeks of the course so that the student might have the
remainder of the quarter to correct his weaknesses. The tests were quite good
in and of themselves but, administered in this way, were, as far as the 
students were concerned, just more of the same kind of testing that they 
had had during high school and the first year of college. True, in each case, 
students were told why they were taking the test, but genuine motivation in 
the taking of tests is hardly achieved through the mere telling of purposes.
Most of these students had been told many times previously of the purposes 
and values of tests. 

After the first hour or two of testing, students’ attitudes toward the 
tests were such that in many instances the outcome was more an index of
attitudes toward the proficiency or background skill than an indication
of a deficiency or competency. Time after time, these students showed
little interest in the results and little comprehension of the significance of 
the results for them. Consequently, they participated half-heartedly in the 
prescribed remedial exercises. 

Concern over the ineffectiveness of these types of procedures led to the
exploration of other approaches. Instead of putting the students immediately
through a battery of survey and diagnostic tests, students were led to analyze,
through group discussions, the types of things that they needed to be able to
do in order to succeed at the college level. This question of what are the
factors that make for success in college was discussed with increasing
seriousness. With respect to reading, for example, they recognized the need for
speed in comprehension. They should be able to see small symbols clearly,
have a functional vocabulary, be able to differentiate between main points
and detail. Then there are special types of reading—diagrams, charts, maps,
graphs, and technical material. Students identified such competencies as
adequacy of oral and written expression, and the supplementing knowledge 
and skills involved. 


As discussion of this kind proceeded, it was not long until, quite
naturally and out of real interest, students began to ask: “Well, how can I
tell where I stand in connection with these skills and competencies compared
with students who are generally successful in college?” The answer, of
course, was: “We have some tests that have been especially prepared for
that purpose. If you really want to know where you stand with respect to 
these competencies, we shall be glad to help you select the tests and use 
them. You may take the test according to directions and when you are 
through you will find keys which you may use in scoring your own papers. 
If you have difficulty in following the directions we shall be glad to help 
you. If you need help in working out your score, you may call on us. You 
may plot the results on your profile chart and we shall be happy to help 
you interpret them. If the test is a time-limit test, we will time it for you 
so that you need not be bothered by the time element.” 


Under these conditions students needed relatively little attention of the
monitoring type. Some, however, had come up through high school with 
their academic standing in occasional or chronic jeopardy and consequently 
had the well-known attitude toward tests—that of beating the game. Occa- 
sionally, a student, who—at one moment—wanted to find out where he 
stood with respect to various of these competencies, found himself, when 
in difficulty, tempted to look on the other person’s paper. It took a few 
reminders, in some cases, to the effect that if students followed that prac- 
tice they would identify the other person’s weakness rather than their own. 
This is an interesting point in the carry-over of strong attitudes and habits. 


When it came to scoring the test, students were encouraged not only 
to determine the numerical score but to analyze the results. For instance, in 
the language survey test, it was suggested that students, using a key which 
had been prepared in terms of actual corrections rather than numerical 
symbols, identify their errors and list them along with the correction in each 
instance. Wherever there was a duplication of error, they were merely to 
add a tally mark. By the time they were through with this invoice, they 
found that many specific errors could be classified into types—the non-dif- 
ferentiation between adjectives and adverbs, the disagreement of subject and 
verb, etc. When these errors were classified, students usually found that they 
could account for fifty per cent of their errors among three or four types. 

During the process of the analysis, students were brought to generalize 
on the nature of their errors. Many of these students had had few such 
experiences. They had memorized some other person's generalizations—
either the instructor's, or those in bold-face type in some grammar or pocket
handbook of English. These latter generalizations had been more or less
memorized as nonsense syllables. But here the students were brought to 
generalize out of their own experiences. 

The checking of the papers, the actual marking of the errors, the 
examination of types of errors at the time of marking, the seeing of certain 
errors build up in number, and specific errors of the same type add up in 
frequency was an experience which made the whole process of evaluation for 
them one of extensive meaning and value. Especially was this so when they 
saw that if they concentrated their efforts on removing three or four types 
of errors, they would reduce the errors in their written expression by
approximately fifty per cent. In most cases, it was a rather forthright step for
these students at this time to ask, “What can I do to remove these
deficiencies?” Throughout this process, the instructor has played the role of
resource person instead of prescriber, and the student under guidance has
made progress in learning to identify his own strengths and weaknesses and 
to appraise his own improvement as he directs his energies toward the 
removal of weaknesses. Robinson (8) has developed a comprehensive guide
to self-evaluation for use of students and instructors in college remedial work.

One further problem should be discussed at this point. If a student 
takes a standardized test when he is psychologically ready for it, that is, 
when it will serve what for him are real needs in identifying strengths and 
weaknesses, it is likely to interrupt mass test administration and possibly do 
insult to norms. 

But, teachers might also raise the question as to whether the frustra- 
tions that may be accumulated by students who repeatedly fall in the lower 
third of the class on standardized tests are compensated for through the help 
received from such tests. Assembling a mass of students in a room to take 
some standardized tests has not, in common practice, been preceded by ex- 
periences which have brought to each student a genuineness of purpose in 
taking the test. Some students may be artificially or externally motivated by 
fear of making a low score, some may be genuinely interested in the insights 
they may gain from taking the test, and others may take the test in quite a 
routine, docile, and unconcerned manner. It is conceivable that taking the 
test with these different degrees of motivation would result in nearly the 
same relative ranking of test scores as might be found if all students were 
genuinely self-motivated. But achieving a reliable ranking of students is not
the only purpose which tests can serve, nor the only goal of education to
which they contribute.

If we agree that one of the major goals of education is to develop in
the student competence in the identification of his own strengths and
weaknesses, we may raise several problems with respect to the use of standardized
tests. Such problems as the following are involved:

1. Can students develop a sense of need for self-appraisal and a
genuine purpose in taking a standardized test unless they have some part in
the selection of the test? Usually, teachers select the tests. Teachers
determine whether or not a test is appropriate to the needs of a student. If the
student were to participate in the selection of the test, would he not need
to read a description of the test, consider the purposes for which it was
set up, and the types of evidence to be gained from it? He might even wish
to examine a copy of the test. In so doing he would violate the conditions
under which tests are usually selected and administered, thus invalidating
any comparison with the test norms. If he has difficulty finding out what the
real purposes of various tests are, he might discover that Buros (1), in his
Mental Measurements Yearbook, is rendering an outstanding service on this
particular problem.

2. Can the student develop an ability to identify his own strengths and 
weaknesses when the tests presumably designed to help him do this are 
scored by the teacher or a machine? Is it not more appropriate that he should 
score his own test and learn to analyze and classify the results himself?

3. Can the student gain really useful information in planning next steps 
in his educational program, unless he understands fully the implications of 
the test results? It is difficult to see how such an understanding can be 
gained without studying the test itself. But if a student studies a test he 
will become test-wise, and the norms on most standardized tests do not 
assume that the student has become test-wise. When the plan for an educa- 
tional experiment calls for administering Form A of a test at the beginning 
of a year and Form B at the end, the experimenter has usually assumed that 
he could not afford to let students learn anything about the tests except
possibly that they made such-and-such a score. What may be needed is to 
have tests standardized originally on test-wise rather than naive students. 
If this were done, the results could still be used for purposes of comparison. 


4. Are standardized and other tests often given in such a way as to
remove from the student opportunity to develop intellectual honesty in
identifying and facing evidences of his own strengths and weaknesses? The
great stress that has been placed on rigid procedures of administration that
will make cheating on the part of students difficult or impossible is an
admission of fear that many students taking these tests are doing so without
feeling that they will derive benefits immediately helpful or useful to them.
Such elaborate precautions might not be necessary if students were
participating in the planning of an education program based on emerging goals
meaningful to them, and were attempting to identify evidences of progress
with respect to these same goals.

For some time to come there will be situations in which some authority 
will need to choose, administer, score, interpret and use the results of
standardized tests. Among these are selective admissions to institutions, teacher
placement examinations, etc. Is it too much to hope, however, that as
education becomes more of a process of continuous self-appraisal, the number of
evaluative relationships in which one person evaluates another will decrease
and evaluation will proceed more and more through a guidance relationship,
in which the teacher or expert serves as a resource person helping the
individual to identify and interpret evidences of his own strengths and
weaknesses?


In the cooperative study in teacher education, teachers in service have
shown an unwillingness to participate in evaluative activities which might
possibly disturb either their economic or psychological security, or in
evaluative activities which give no real promise of help with respect to their own
growth. Those concerned with the evaluation of teacher growth have been
forced to re-examine evaluative procedures in that light. Is it not appropriate
that these same teachers, whether they be instructors in professional schools
or public schools, should give as much attention to the feelings and security
of boys and girls in the evaluative relationship with respect to their growth
as teachers demand from the evaluative relationship with respect to their
own growth?

SUMMARY

The challenge to increased attention to and experimentation with the
possibilities of self-evaluation in teacher education is presented with the
following theses:


1. Self-evaluation under guidance is consistent with a philosophy that
honors the freedom and integrity of individuals.

2. Evaluative situations in which purposes, instruments, methods of
administering, scoring and a[...] are determined solely by
someone in authority over the evaluated, are more appropriate under
a dictatorship than in a democracy.

3. The every-day decisions children and adults are called upon to make
are largely based on evidence identified through self-evaluation.

4. Self-evaluation is supporting to those learning situations in which
the learner is self-propelled by realness of problem and genuineness
of goal.

5. If teachers are to help students develop increased competence in
identifying their own strengths and weaknesses and in appraising
their own progress, they should carry major responsibility in
evaluating their own professional growth.

6. Present known means of appraisal can be useful in self-evaluation
if the student is seeking improvement toward goals which have real
significance to him.


EVALUATION IN ART


RAY FAULKNER 


Head, Department of Fine and Industrial Arts 
Teachers College, Columbia University 


Editor's note: One of the more difficult areas to which evaluation tech- 
niques have been applied is that of art. The author discusses the problems, 
issues, and advances in this area. 

For some eighty years, problems of evaluating art objects and processes 
have been given attention by a few psychologists and educators. A
considerable body of evidence concerning reactions to art objects and a few
measuring scales and tests have resulted from this work. In addition, a host 
of issues and problems still to be studied have been raised. 


SCIENTIFIC INVESTIGATIONS AND TESTS 


Scientific studies of art preferences were begun by Fechner in the 
1860's in an attempt to discover if there were sets of proportions favored 
by the majority of observers. The results were inconclusive and, rather than 
leading to the discovery of any universal laws, emphasized the complexity 
of the problem and the variability of human reactions. Fechner did, however, 
lay the groundwork for scientific esthetics and developed the psycho-physical 
methods of choice or selection, adjustment or construction, and application or 
use. Later studies, such as those by Glascock, Cattell, and Washburn (13), 
Pintner (45), Williams (56), and Gordon (14) continued work of this 
character. 

The development of drawing scales grew out of the child study move-
ment. As early as 1893, Barnes (1) reported an extensive study of children’s
drawings. It was only natural that there should be attempts at measurement
in this area, and Thorndike’s “Scale for General Merit of Children’s Draw-
ings” (51) was published in 1913 and revised in 1923. The basis of merit
in these scales is realistic representation. Two scales developed by Kline and
Carey (16, 17, 18, 19) measure achievement in representation and com-
position, respectively. The attention paid to composition, as well as literal
representation, marks a forward step although the separation of these two
integrally related aspects of conventional drawing is questionable. A “Draw-
ing Scale for Young Children” was produced by McCarty (31) and pub-
lished in 1924, and the Providence Public Schools (47) issued “The
Providence Drawing Scale” four years later. These scales have been critically
reviewed by Bird (2). The “Graphic Work-Sample Diagnosis” (20) made
a substantial stride forward in its avoidance of adult standards as the major
criterion of excellence in children’s drawings.

A second type of test has sought to measure art judgment. Such tests are
often referred to as “tests of appreciation,” with that term being defined as
“the capacity to recognize the comparative artistic merits of forms already
created” (16, p. 8). Three such tests, in which the subject is required to
rank in order of merit from two to five pictures, stand out in point of time
and thoroughness of preparation. The “Meier-Seashore Art Judgment Tests”
(32, 35, 36, 37) and the revised version called “The Meier Art Tests,
I. Art Judgment” (38, 39) are based on the assumption that sensitivity to
and recognition of merit in composition or design in paintings are the major
s in the field of art. The “McAdory Art Test” (29, 30, 50) has items
based on differences in form, color, and notan. The primary criterion of
excellence again is composition, but some of the items also show differences
in expressive and functional qualities. The Christensen “Test of Art Ap-
preciation” (5, 6, 7, 15), also based on differences in the design aspects
of art, was published in an experimental form but was not made generally
available. A test of art judgment directly applied to problems of house
planning and furnishing, “The Minnesota House Design and House Fur-
nishing Test” (3), requires the subjects to make judgments based on com-
position, functions, and suitability. Critical comments on these tests have
been made by a number of writers (10, 16, 33, 34, 49, 57).

Those tests which seek to discover potential art talent of an order
high enough to merit professional training are typified by the “Knauber Art
Ability Test” (21, 22, 23, 24, 25), and the “Selective Art Aptitude Test”
(52, 53) recently developed by Varnum. Each of these tests, which are in
reality batteries of tests, requires the subject to perform a number of tasks
similar to or typical of those required of professional artists.

Another phase of evaluation in art which deserves mention is that of
evaluating progress of students in the schools. By far the most elaborately
and carefully prepared tests for this purpose are the “Tests in Fundamental
Abilities in Visual Arts” (25, 26, 27) prepared to measure progress accord-
ing to the objectives of one school system. A description of the methods used
in evaluating progress in a general college art course has been prepared by
the writer (12).

Knowledge of facts and principles, in addition to being measured in
two of the batteries of tests mentioned above, finds an increasingly important
place in such tests as the “General Culture Test” (48), the “Cooperative
Contemporary Affairs Test” (9), and the “Graduate Record Examination” (4).
The techniques are those generally found in typical objective tests. These
are significant less because they contribute new ideas or suggest new direc-
tions than because they are evidence of the recognition given to the role of
art in our social scene.

This, in brief, is the picture of some of the representative accomplish- 
ments in the evaluation of art activities and products. It presents a sub- 
stantial record of progress. The tests and investigations have been carried 
forward in accordance with the techniques of scientific educational research.
The tests have proved to be sufficiently reliable for many purposes and,
within the philosophic framework on which they are based, are valid measur-
ing instruments. As pioneer work in a difficult field, they deserve high praise.
They have opened new paths of thought, and have given a general idea of 
the possibilities and limitations of the field. On the other hand, it is equally 
true that few of the art tests have lived up to the hopes and claims ad- 
vanced for them. With more enthusiasm than insight, some investigators 
have assumed that their test or tests would perform many functions for 
which they were not suitable. In an effort to establish something tangible in 
an area of activity usually vague, some workers have seriously over-
simplified the problem. The part has been mistaken for the whole, and
details have been taken for essential factors. One is tempted to say that 
some of the work is pseudo-scientific, not because the investigations have 
lacked scientific rigor, but because the whole problem has not been deeply 
appreciated. This in no way denies the importance of past work, but rather 
calls attention to the subtlety and complexity of the problem. 


INFORMAL EVALUATION 


Most appraisals in art, however, are of an informal nature. The time 
consumed in administering and the expense of purchasing standardized 
tests are two factors limiting their use. Artists and art teachers also have 
been somewhat suspicious of “scientific tests.” More significant, however,
is the fact that few ready-made tests are exactly suited to the needs of 
varying situations. Although many of the same factors are regarded in 
informal and objective evaluations, in the former it is possible to make 
adaptations and adjustments to the immediate set of conditions. Further-
more, informal appraisal can take into account the relation of art progress
to needs, interests, background, growth and development in a direct, 
functional manner. 

Informal evaluation most frequently suffers from lack of an organized 
program. Factors which should have little or no importance sometimes loom 
large because of their immediacy, and others of greater moment may for 
the time being be neglected. Such errors, however, are not inherent in the 
method, as is shown by the stimulating and comprehensive statement pub- 
lished by a committee of the Progressive Education Association (46, Chap. 
{), and in discussions by Whitford (55), Patzig (44), and Vaughan (54). 


MAJOR ISSUES 


Any consideration of the general problems of evaluation in art leads 
inevitably to a consideration of the nature of art activities, for without the 
clearest possible understanding of what one is trying to measure all the 
techniques and devices known are useless, or worse, definitely misleading. 

The Subjectivity of Art. The major problem centers around the sub- 
jectivity of art, and, consequently, its relation to science and the scientific 
method (40). On this question there is little agreement. Some believe that 
art and science have little if anything in common, that the methods suitable 
to one are without value to the other. Another point of view is that the two 
are so markedly similar that there is no essential difference. A third belief 
(to which the writer adheres) is that all human activities have both common 
and differentiating characteristics. For example, to say that science is objective 
and art is subjective is a dangerous over-simplification because it is not an 
“either-or” matter. Some aspects of science are subjective, some aspects of
art objective. The experiences of creative artists and others sensitive to
esthetic values, however, lead overwhelmingly to the conclusion that the
subjective plays an exceedingly important role in the arts. 

This in no way implies that art is mystical or supernatural. It is to say 
that in the realm of esthetics the feelings and emotions have a predominant
role, bringing about a unitary, organic response not often found in the 
sciences. Although creative endeavors in the two fields have much in com- 
mon, they are certainly not identical. In general, the scientist seeks to discover 
existing relationships in man and in nature while the artist seeks to create 
new relationships, but there is seldom a clear-cut distinction between the 
two. When the scientist seeks to create new relationships, his efforts are
artistic, as in the development of new plastics; and when the artist seeks
to discover existing relationships, as in the discovery of the laws of perspective
by the early Renaissance painters, his efforts are scientific. The methods of
working are also different. The scientist depends largely on analysis and 
isolation of parts, while effective art effort depends more on an intense 
wholeness of approach. The products likewise show differences as a com- 
parison of a mathematical formula with a painting will demonstrate. The 
solution to a complex mathematical problem can often truly be called 
“beautiful,” but it carries in itself none of the mathematician’s personal 
experiences which went into its production. The effectiveness of a painting,
in contrast, depends largely on the degree to which it carries directly to the 
observer the painter's vivid thoughts and feelings that went into its 
production. 

To date most of the work in evaluation in the arts has followed the 
scientifically logical procedure of isolating measurable traits and developing 
suitable instruments. The traits selected have generally been those most 
amenable to research techniques rather than those which are most central 
to art activities. Consequently, the periphery rather than the core has re-
ceived most attention. As studies progress it is inevitable that more and
more traits will be isolated and measured with the result that many which 
we now call subjective will come to seem more objective. At present, how- 
ever, the more personal and subtle—and perhaps the most important— 
aspects have not been dealt with adequately. 

The Complexity of Art. When our understanding is complete, it is 
possible that art will be proved to be no more complex than any other 
phase of human effort. Be that as it may, at present there are countless 
unsolved problems and debated issues which are complicating factors. Art 
has many definitions and many approaches, on none of which is there 
unanimity of agreement. 


Art is sometimes considered primarily as process, an emphasis carried 
forward by present day progressive educators. When considered from this 
point of view, important questions are: What happens to a person while 
he is creating or appreciating an art object? What changes, temporary or 
permanent, result? Art as process is closely related to personality growth and 
total development, throwing attention on esthetically growing individuals. 
Broad generalizations on group achievements, or scores on tests measuring 
one small aspect of art, are of interest but not great consequence to an
educator who sees art in this way. 

Art may also be considered as skill, raising the questions: What skills 
have been developed or improved, and what is the extent of the improve- 
ment? The acquisition of techniques, usually of a highly specific nature, and 
of craftsmanship becomes a major consideration, more important often than
the artistic merit of the object produced or than the esthetic growth of the 
individual. The drawing scales and some units of the aptitude tests are 
based on this concept of art. 

Art as knowledge stresses the importance of names, dates, and his- 
torical developments, such as is emphasized in most history of art depart- 
ments at the college level. This phase has been conveniently measured by 
the conventional objective testing techniques. 

Art as creative originality calls attention to inventive vigor and unique- 
ness. The major question is: To what degree is the creation or appreciation 
of a work of art the result of fresh, unique experiences? It suggests devia- 
tion from rather than conformity to established rules or opinions. Because 
scales and tests normally depend on comparison of the subject’s performance 
with previously established standards, this aspect of art is exceedingly diffi-
cult to measure. Obviously, standards for uniqueness cannot be set up in
advance except to show what is conventional or typical. The measurement
thus becomes negative rather than positive; if the reaction differs from the
ordinary or expected, it is creative. To what degree it is creative and whether 
or not the creativity is in a desirable direction are problems which as yet 
no measuring device has found a way to quantify. The problem is further 
complicated by the fact that creativity can be assayed from at least two points 
of view. One is the comparison of a reaction or product with those of other 
persons, while the second is to compare it with previous responses of the 
same individual. So far there has been little progress along these lines. 

Art can also be considered as product, such as has been done in most 
of the art judgment tests constructed to date. The major question becomes: 
What is the relation of one art product to similar objects? One painting or 
one building, for example, is assumed to be definitely superior (presumably 
under all conditions and for all observers) to another painting or building 
with which it is compared. Each work of art is thus assumed to have a 
definite, objective value more or less independent of the culture which pro- 
duced it and the experience of the observer who views it. When art is 
regarded in this light, the individual tends to fade into insignificance in the
face of the opinions of experts and the long-term judgments of history. Art
tests, based on this point of view, ask the subject to pit his judgment against
that of experts, but on one criterion only: recognition of merit. The depth
or intensity of response, the richness of personal experience lying behind
the choice, and the reasons for making the selection are generally ignored.

The emphasis on art as product can be further subdivided. Art objects 
can be viewed primarily as compositions or design in form, color, space, 
line, and texture; in other words, the formal or design aspect is deemed most 
important. The Meier-Seashore Art Judgment Test and the Christensen Art
Test belong in this category. Other tests, such as that developed by Dr.
Paul Diederich called “Seven Modern Paintings” (8) and some of the tests
developed by the writer (11), are focussed on expressive considerations. Cul-
tural or contextual relations play a part in the “Minnesota House Design and
House Furnishing Test” (3) and in the “Faulkner Congruity Test” (11).

Other approaches to art could be mentioned, but these are sufficient 
to give some idea of the ways in which evaluation can be approached.
There is little data on how closely these approaches are related to one 
another, and there is far from complete agreement on their relative im- 
portance. With the exception of the batteries of tests, the majority of art 
measuring instruments have been confined to one interpretation of the nature 
of art. This, in part at least, explains the low correlations usually found 
among the results of different tests, and seems to preclude the possibility of 
getting one test which will give a valid indication of a person’s appreciative 
or creative ability. 
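
The notion of a low correlation among the results of different tests can be made concrete with a small computation. What follows is only an illustrative sketch in Python and is not taken from any of the studies cited: the two score lists are invented for the example, and `pearson_r` is a hypothetical helper name. It computes the ordinary product-moment correlation between the scores of the same students on two tests.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Invented scores of the same six students on two hypothetical art tests,
# one stressing design and the other expressive qualities (assumed data).
design_test = [10, 20, 30, 40, 50, 60]
expressive_test = [30, 10, 50, 20, 60, 40]
r = pearson_r(design_test, expressive_test)
```

A coefficient well below 1.0 for such a pair would mean that a student's standing on one test tells relatively little about his standing on the other, which is the situation the paragraph above describes.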

SUMMARY AND CONCLUSIONS 


Educators are sincerely interested in determining the amount and direc- 
tion of their students’ progress in art, and are constantly looking for ways 
of determining these results more effectively. Counsellors and guidance ex- 
perts are hard pressed for means of discovering art aptitudes. Are there tests 
which will measure student progress and which will give valid indices of art 
aptitude? No general answer can be given, but the following statements can 
be made. 


Drawing Scales: There are drawing scales (18, 31, 51) which measure 
the development of drawing ability of children according to realistic rather 
than expressive standards, and one (20) which shows the relation of each 
child’s abilities to those of other children. 

Art Judgment Tests: There are several tests of art judgment (3, 29, 36,
38) which show how closely a person agrees with the opinions of a limited
number of experts regarding the merit of certain art objects. Whether or
not any one of these measures gives a true and complete index of a person’s
sensitivity to art values is still open to question. The fairly low correlations
among these tests (11) indicate that no one of them measures a general
ability to judge art (if there is such a general ability). Furthermore, there 
is good reason to believe that the relation between creative art ability and 
art judgment as measured by these tests is low, making them of very limited 
value in predicting success in professional art studies. 

Achievement Tests: There is at least one battery of tests (26) designed 
specifically to evaluate achievement in terms of the objectives of one public 
school art program. Within these limits, the tests are valid. 

Aptitude Tests: There are two published aptitude tests (22, 52) de- 
veloped specifically to predict future success in art. Although these tests are 
worthy efforts, they need to be used more widely under varied conditions 
and the results checked against careful follow-up studies before positive 
statements concerning their value can be made. 

Informal Evaluation: An informal, subjective approach to the evalua- 
tion of art aptitudes and achievements is, and undoubtedly will continue 
for some time to be, the most widely accepted method. The majority of 
artists and art teachers, depending as they do on subjective judgments in 
their own creative efforts, have considerable faith in them, and have found 
no test or tests which effectively and completely displace them. In informal 
evaluation a comprehensiveness of judgment can be achieved that is as yet 
impossible with standardized tests alone. Many educators have found, how- 
ever, that the use of some standardized measures is a valuable supplement 
to their own judgments. 


EDITORIAL


THE SYMPOSIUM ON EVALUATION 


THE idea of devoting a special issue of the Journal of Educational 
Research to evaluation originated as an outgrowth of hearing a number of 
papers presented by members of the various educational associations which 
met at St. Louis in 1940 and at Atlantic City in 1941 at the time of the 
conventions of the American Association of School Administrators. As one 
listened to the presentations and heard frequent mention of the term 
evaluation, he became aware of the fact that a new concept was being added 
to the galaxy of educational terms. Since the use of this term suggested no 
standardization of meaning and since the implications of the term suggested 
a new phase of educational thinking and research, the Board of Editors of 
the Journal approved the writer's proposal for devoting a special issue to 
the general discussion of the concept and to the presentation of efforts at 
application of the principles of evaluation in sample fields of educational 
endeavor. In approving the proposal, the Board of Editors hoped that such
an issue would acquaint its readers with the philosophy underlying the term 
evaluation, with some of the instruments and techniques employed, and with 
some of the problems involved in efforts at making satisfactory evaluations. 


No attempt was made in this special issue to have a number of con- 
tributors discuss the meaning and significance of the term evaluation, nor 
to cover all fields to which the term might have application. Rather, it was 
decided to set forth something of the evolution of the term, its definition 
and underlying assumptions and specific applications to fields involving the 
appraisal of programs of education or of subtle educational values. Although 
the writer submitted the proposed outline of the series to all the con- 
tributors for comment or modification, he must assume responsibility for 
any shortcomings in the general planning of this issue. The contributors 
graciously prepared articles on the topics suggested by the writer. It might 
have been possible to have had a more integrated issue had time allowed 
each contributor to have read all the articles submitted and to have made
any desired modifications. 


While the issue is limited in scope, no one can read the articles without 
being impressed with the fact that those who assume responsibility for 
making evaluations are well aware of the complexity of the process. Through-
out the different articles there is either a definite statement or a subtle
assumption that a program or process of education must be evaluated in
terms of the purposes which the program or process was set up to attain.
This assumption thus presupposes a definite formulation of the values to be 
attained. All of the contributors illustrate the multiplicity of the values 
involved, the complexity of the task and the variety in the types of data 
needed for a satisfactory appraisal. Almost all the contributors point to the 
contributions of early efforts at measurement and appraisal, but at the same 
time indicate the limitations of these efforts and point to the new types of 
research which must be undertaken if completely satisfactory evaluations are 
to be made. 


One of these new fields concerns the need for evaluation in terms of 
the purpose of the individual learner and in terms of the process of learning 
as it affects the learner. Faulkner, Raths, Troyer and possibly others indicate 
that in the past many efforts at measurement have been by means of instru- 
ments constructed with current social values and purposes in mind, but 
which have been administered without much consideration of the learner's 
purposes and without cognizance of the effects of a given experience on the 
process of the individual learner. These contributors render a real service in 
outlining a fruitful field for future research. 


CLIFFORD WOODY,
School of Education, University of Michigan. 


APPRAISING CHANGES IN VALUES OF COLLEGE STUDENTS


LOUIS RATHS

Ohio State University


Editor's note: Evaluation is a many-sided, complex activity. The author
discusses problems, issues and techniques in the evaluation of the activities
of college students.

Most of the existing instruments for appraising the values of college
students seem to be based upon the assumption that human personalities
are not infinitely varied. The makers of instruments for the appraisal of
values begin their tasks with a series of value categories in mind. They
proceed to formulate statements which have bearing on these categories.
To the degree that a student accepts, approves, prefers, chooses certain state-
ments it is concluded that he cherishes the “value” with which these state-
ments are associated in the scoring key.

In one instrument he may be confronted with a series of situations
which require him to select leisure time activities. If he consistently chooses
to elect those of a musical nature he is said to place value on music, to hold
music as a value. On an attitudes test he may agree with a number of state-
ments favoring freedom of speech, of the press, of religion, of the rights
of assembly, and as a consequence these values are attributed to him. Some-
times a test maker synthesizes a number of categories into a larger one and
attributes “liberalism” or “conservatism” or “radicalism” to students.

Some examinations relate to current social problems and the student is
asked to choose from a number of alternative solutions the one which he
prefers. This is followed by a direction to choose from a list of “reasons”
those he would elect to support his solution. As in the former example,
the test maker has formulated the solutions in terms of certain value-
categories and has designed the list of “reasons” in similar fashion. When
a student consistently chooses conclusions and reasons associated with one
of the test-maker’s value categories, he is said to hold this value. This infer-
ence is reinforced if the student, in addition, consciously rejects statements
relating to an opposing value.

Instruments constructed according to this general formula are now quite
the style, and they are indeed ingenious gadgets. In the limited space avail-
able, mention will be made only of those instruments which are directly
concerned with “values.” The Allport-Vernon¹ instrument, A Study of
Values, was constructed on a theory of types of men projected by Eduard
Spranger and inferences are drawn from each student's test responses con-
cerning these values: theoretical, economic, aesthetic, social, political, and
religious.

Maller and Glaser² published in 1939 The Interest-Values Inventory
which is based upon the work of Allport and Vernon and, in part, on Thur-
stone’s factor analysis of the Strong Vocational Interest Blank. The direc-
tions to the students are significantly different, the content is different, and
the value-categories have been reduced to four: theoretic, aesthetic, social, 
and economic. 

The Evaluation Staff³ of The Progressive Education Association has
developed some instruments involving the solution of social problems and
interpretations of student responses are related to the following value-
categories: human values, property values, democratic rights and privi-
leges, individualism, cooperative group action, and values which approve
compromise. 

Mooney⁴ at The Ohio State University has experimented with problem
check lists at the college level and the students’ worries, as reported on the 
blanks, are inferred to be associated with values. His categories include: 
health and physical development; finances, living conditions, and employ- 
ment; social and recreational activities; social-psychological relations; per- 
sonal-psychological relations; courtship, sex, and marriage; home and 
family; morals and religion; adjustment to college work; the future—voca- 
tional and educational; and curriculum and teaching procedures. 

In 1941, Harding developed two values tests at Ohio State University.
One is entitled Value-Type Problemmaire, and the other Value-Type Gen-
eralizations Test.⁵ The value categories underlying the construction of these
two instruments follow: democracy, authoritarianism, naturalism, transcen-
dentalism, socialization, personal security, progress, status quo, activism,
passivism. Both tests are scored in terms of these values.⁶


¹ Gordon W. Allport and Philip E. Vernon. A Study of Values. New York:
Houghton Mifflin Company, 1931.

² J. B. Maller and Edward M. Glaser. The Interest-Values Inventory for High
School and College Students and Adults. New York: Bureau of Publications, Teachers
College, Columbia University, 1939.

³ Progressive Education Association. Evaluation in the Eight-Year Study. Chicago:
University of Chicago.

⁴ Ross L. Mooney. Bureau of Educational Research, Ohio State University, Colum-
bus, Ohio.

⁵ Lowry W. Harding. Value-Type Problemmaire and Value-Type Generalizations
Test. Columbus, Ohio: College of Education, Ohio State University, 1941.

For use at the college level there are many other instruments relating 
to attitudes and to interests, and these may be thought of as “values” tests 
just as appropriately as those already mentioned. The point to observe is 
that all of these instruments are based upon certain fixed categories of 
values. These categories have been “set” by the test maker. Each response
that a student makes is mechanically related to a value-category. In some 
tests it is an either-or type of relationship. In other instruments the student 
can assign his own weights to a statement and inferences are drawn from 
these weightings. In every case “values” are attributed to a student in terms 
of certain selections made from a very limited number of alternatives set 
by the test maker. 
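
The mechanical character of such scoring can be shown in a few lines. The following is a minimal sketch in Python under stated assumptions, not a reproduction of any published instrument: the statement labels, the key itself, and the name `attribute_values` are all hypothetical, and only the category names are borrowed from those mentioned above.

```python
from collections import Counter

# Hypothetical scoring key: each statement is tied in advance, by the
# test maker, to exactly one value category.
SCORING_KEY = {
    "s1": "aesthetic",
    "s2": "economic",
    "s3": "aesthetic",
    "s4": "social",
    "s5": "economic",
}


def attribute_values(accepted, key=SCORING_KEY):
    """Tally the statements a student accepts by the category the key assigns."""
    unknown = [s for s in accepted if s not in key]
    if unknown:
        raise KeyError(f"statements not in scoring key: {unknown}")
    return Counter(key[s] for s in accepted)


# A student who accepts statements s1, s3 and s4 (invented responses)
tally = attribute_values(["s1", "s3", "s4"])
```

Nothing in the tally records why the student accepted a statement, or under what conditions he would do so; the "value" attributed to him is wholly a product of the categories fixed in the key, which is the limitation the paragraph above points out.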

A study of the available instruments raises some very complex questions. 
What are values? Are they to be thought of as comprehending every single
thing that any person prefers, cherishes, desires, holds dear? By values do 
we mean also every single thing that any person rejects, abhors, holds in 
disesteem? Do we include in our meaning of the word the variations asso- 
ciated with space, time, people? For it is quite obvious that we place differ- 
ent values on things depending upon when, where, with whom the valuing 
act takes place. Moreover, the means available to achieve our goals often 
play a very significant part in our value schemes. If we can secure a goal 
only by certain means which we abhor we are quite likely to reject the goal 
at that particular moment. 

Are personalities infinitely varied or is there to be found some scheme 
into which a very large number of our college students may be classified? 
Are there characteristic “ways of living” and can we find value schemes 
which represent the activities and motivations of men and women? Granted 
that there are such things as personal outlooks on life, do these points of 
view tend to be guides to the behavior of an individual? Or is the behavior 
of most individuals so infinitely varied that at different times and in different 
situations it is motivated by a great number and variety of values, often in 
conflict with each other? Can we go ahead rather safely on the assumption 
that for practically all individuals there are certain “central tendencies” 
which guide thought and action? What are the origins of these tendencies
or dispositions? How are they developed? How can they be identified?
These are some of the questions underlying the appraisal of values.


⁶ Lowry W. Harding. The Place of Value in the Education of Teachers. Disserta-
tion, Ohio State University, Columbus, Ohio, 1941.

If there are “central tendencies,” certain “dispositions” to act in conformity
with a pattern of values, how may this be explained? In a given
situation, one may act ever so kindly, protecting human values and placing 
them highest in that particular context. Can it be said that a person so 
acting brought this value to the situation? It can be assumed rather rea- 
sonably that every person has been kind and unkind to other people and 
if he brings anything at all to a situation he surely can bring both positive 
and negative values to bear. Why then, in any particular situation, or in a 
number of situations, do we say that an individual has certain values?

Suppose, however, that we take the position that every situation is 
unique. An individual in a situation takes all or selects some of the resources 
entering into that situation and makes something out of those resources. 
At a certain remark he becomes angry. Before he became a part of that 
situation he was not angry. He may have brought “anger” with him but 
it was not present before that particular remark and as was said before he 
probably brought pleasantness to the situation too. How explain that in 
this context of people and time and place he utilized the resources at hand, 
to make an anger response? The significance of the response may have been 
at a low level of consciousness or may have been unconscious. Something in 
the situation, in the relations between the individual and his environment,
provoked anger. This response, it has been assumed, is in some way associated
with the values held by the angry person previous to that particular
moment; these values were threatened in some fashion: ridiculed, rejected, 
scorned, etc. Instead, however, of assuming that the angry man held these
values previous to the situation, isn’t it just as reasonable to assume that he
created the value in that situation, that his very response created the value
for him and created it also for other people listening and observing to
approve or disapprove, to accept or reject, to cherish or hate?

To believe that values are thus continually created in all situations does 
not prevent us from generalizing about the trends of an individual's be- 
havior. Instead of postulating that a person has certain values, we form- 
ulate our statement to say that he tends to make certain values out of the 
situation. In other words, as a person continues to make values which bear 
a close resemblance to each other we are in the habit of saying that he has
those values. If we are not careful we shall think of these values which we
attribute to him in a rigid sort of way. We shall think we have a sound
basis for predicting what values he will express in a future situation. If,
however, we recognize how influential the ingredients of any one situation
are, if we come to see that values are created in the situation, we shall be
content with predictions that are very much qualified, that will probably 
be much nearer the truth, and we shall save ourselves much of the
disappointment and discouragement which are the consequences of rash hopes
or ill-founded plans.

If we take the position that personalities are infinitely varied and that
any one personality is continually creating values out of the materials at
hand and that the latter are significant in their influence, what leads do we
have for the appraisal of values? Gordon W. Allport* has stated his position
very clearly:

“I realize my proposition that human motives are infinitely varied 
will startle and dismay many social scientists and educators. ‘How on 
earth,’ they will ask, ‘can we work with such an anarchic doctrine?’ They 
will argue that they must know what the basic drives and needs of men are 
in order to understand, satisfy, or train them; and that planning for social 
betterment is not possible without a uniform — of the fundamental 
goals for which all men strive. They would prefer to invoke by rule of 
thumb McDougall’s 18 propensities or Murray's 28 needs, or, still better 
since they are busy people, Thomas's four wishes or the simplex dynamisms 
proposed by Freud, Adler or Marx. 

“My objection is that all such schemata are too abstract to apply to 
living men. Their abstractness as well as their arbitrary nature is attested 
by the failure of psychologists and biologists to agree upon either the nature 
or the number of these allegedly fundamental drives. 

“But you counter, ‘Do not all normal men want, let us say, food, 
personal security, and sex satisfaction?’ I reply: Are any two lives precisely 
alike in their expectancies or desires concerning security, sex, or even food? 
What men really desire is their own peculiar and individual forms of security 
and satisfaction. Even in respect to the ‘basic food drive’ I would say: the 
need for nourishment never stands alone; tastes, standards, and attendant 
images differ, and what is meat to one is poison to another. The object of 
the desire is integral with the want.” 


*Gordon W. Allport. “Liberalism and the Motives of Men.” Frontiers of Democracy,
February 15, 1940. P. 136.

Does it follow that no appraisal of values is possible? Not at all. We
must give up this notion of a pre-determined, universal scheme or schemes
into which we shall classify our fellow men. Experience indicates the futility
of such efforts and we are coming to believe that the job cannot be done
that way. Even if agreement could be secured on a values scheme, even 
granting its possibility, current trends in evaluative techniques indicate that
appraisals would take the form of allowing students to express themselves 
only in terms of the scheme laid down. Grant that human personalities are 
infinitely varied and it follows that these testing procedures would do 
violence to the appraisal of the values of many personalities. Such testing 
would result in attributing certain values to the students and these would 
have to be derived from the alternatives offered in the tests. 

In the opinion of the writer, individuals do “take” values to a situation.
These values, however, are not fixed principles which determine conduct.
These ideas, principles, values become part of the resources of the next
situation and the behavior in this new situation is created out of these 
resources. These newly created behaviors are not the exact equivalents of 
the values brought along. What is created is something different. From 
what is created an observer may abstract or synthesize something which 
he terms a value. But these generalizations or abstractions again will not 
be applied exactly so in future situations. This continual variance of “values
in action” confirms the opinion that dependence upon tests with their
limited, fixed categories and their artificiality so far as life functioning is
concerned will not lead us far. What are some of the important requirements
of a more valid appraisal of values? In the first place, there must be
an extraordinary freedom of choice to the student. The testing situations 
must be indeterminate. Different students with different backgrounds of 
experience must have opportunities to express their differences. In the 
second place, we mustn't expect that all students will be able to make values 
in situations restricted to verbal activity. Some will create values more clearly 
in song and story, with paint and brush, with wood or metal, and in many 
other ways. These varieties must be provided. 

In the third place, we can expect more values to be created in situations 
which involve some “sizing up,” some planning, some projection of activity 
over a period of time. Under these circumstances many more choices must 
of necessity be made if the planning is at all comprehensive: choices of 
final products, choices of procedures to follow, of resources to utilize, of 
consequences to foresee, of relationships to other activities. In such a com- 
plex of behavior, the student will probably define some of his values in 
pairs: the thought of one which is prized will suggest another to be avoided. 


March, 1942] CHANGES IN VALUES 563


Fourthly, a good appraisal will avoid the rash conclusion that a student 
prizes that which he did in a particular situation select from all the alterna- 
tives available. Selections are significant but not necessarily determinative 
of value. An individual may postpone a greater value. Exigencies of time, 
of place, of obligation may influence a decision. The overt act may conceal
greater values.

In the opinion of the writer much will be gained in the appraisal of
values by the further study and experimental use of the projective techniques
used by Murray.* The exploratory work of Grimes* in studying the values
in a student's art work is suggestive and most certainly worthy of further
research in this area. The writer's* own experience with a method of
analyzing the reports of student observations seems to merit further study.
The fact that these techniques have not been widely used, are rather involved,
and have been rather fully described elsewhere may justify mere
mention of them here. They do satisfy the criteria thus far presented. There
is much freedom, planning is involved, and responses are not necessarily
verbal. Their effective use requires great skill on the part of a teacher and
much caution in interpretation.

The latter two have a characteristic which deserves special mention as a 
fifth requirement in an appraisal of values. They constitute a procedure 
rather than an instrument. The student's contribution is analyzed for the 
purpose of tentatively ascribing values to him. There follows a discussion 
between instructor and student where the values thus tentatively attributed 
may be clarified. The test here is part of a larger process and is an organic
part of everyday experience. The student's work was not stopped in order
that he might take a test on values. Instead values are associated with his
work, are part of it, and consciously related to it. In terms of a particular
student, at that particular time, it may be unwise to acquaint him with “the
values he expressed.” The value data in every case, however, may be the
basis for formulating hypotheses about next steps in the guidance of the 
growth and development of these students. 


* H. A. Murray. “Exploration of Covert and Unconscious Themas: Projection
Tests,” Explorations in Personality. London: Oxford University Press, 1938. P. 529.

* James W. Grimes. Values in a Work of Art. Educational Research Bulletin, 
XIX. Columbus, Ohio: Ohio State University, 1940. 

* Louis Raths. Approaches to the Measurement of Values. Educational Research 
Bulletin. Columbus, Ohio: Ohio State University, 1940. 
The procedures suggested by Murray, Grimes, and Raths have a second
characteristic in common. The student is not limited to a series of choices
determined by the test maker. Instead, he writes freely or draws freely. The
summary description of the student's values is drawn from his expression.
This is not to imply that the interpreter does not have categories in mind
as he summarizes. Nevertheless, the student has much more freedom to ex- 
press himself and the tester, faced with the improbability that every student's 
responses will fit into some preconceived series of categories, must summarize 
each student's work in terms of categories suggested by his particular 
responses. 

What is the writer interested in so far as values testing is concerned? 
He is concerned that students should be engaged in the process of examining 
their own values. Evaluation in this area, as he sees it, should take the 
form of contributing in such ways as will aid students to see themselves 
more clearly. How can this be done? The primary task in the educational 
situation is to bring the student and the instructor together in circumstances 
where they will both be carrying on a cooperative inquiry about matters that 
are very important to the student and to the instructor.

In other words, in this business of appraising values, the writer's faith 
lies not in ingenious schemata, nor in tests as we now conceive them, but in 
the possibility of reconstituting our educational situations toward the end 
that they might indeed be value experiences for students, in the intelligence 
of those who are questing for clarifying their own values, and in the con- 
tinuing worth at all times in the sharing processes which are a part of co- 
operative inquiry. In other words, it is to the process of clarifying values 
that we must give greatest allegiance in our schools. Fire, flood, famine and 
the sword, the press, the radio, the movies and perhaps television, the 
capitalists and the workers, the Socialists and the Communists, Republicans 
and Democrats, the church and the schools, all operate to influence our 
behavior. They have an influence on the values which we will create in a 
particular situation. With such forces impinging, fixed value schemes are 
surely nonfunctional. Our faith is better justified in clarifying the valuing 
process. We must develop ways of identifying values expressed in action 
in a social system where change is the only constant. We must bring to 
all students an intellectual awareness of the values they are creating in their 
daily activities. This clarification is a necessary preliminary to the securing 
of a better society. 